
Feature pruning #106

Merged
cowchipkid merged 5 commits into master from FeaturePruning
Sep 12, 2017

Conversation

@cowchipkid
Contributor

The feature pruning implementation prunes low-value features for SVM, LTU subclasses, and sparse nets.


// save the final model.
System.out.println("Writing " + getName());
learner.save(); // Doesn't write .lex if lexicon is empty.
Member

@cowchipkid can you clarify the change here? I'm confused about what's happening.

Contributor Author

I moved the accuracy reporting outside the block to ensure doneTraining() gets called before accuracy is reported. doneTraining() applies the feature optimization, and we want the score after that. If accuracy is reported before doneTraining(), we get the score of the un-optimized models, which would be all for naught.
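The ordering described above can be sketched as follows. This is a minimal toy, not the actual LBJava API: `ToyLearner` and `accuracy` are illustrative stand-ins, and pruning is reduced to a flag flip in `doneTraining()`.

```java
import java.util.List;

public class TrainThenScore {
    /** Toy stand-in for the real learner; doneTraining() is where pruning would run. */
    static class ToyLearner {
        boolean optimized = false;
        void doneTraining() { optimized = true; } // feature optimization applied here
        boolean classify(int x) { return x > 0; }
    }

    /** Fraction of examples classified correctly. */
    static double accuracy(ToyLearner l, List<Integer> xs, List<Boolean> ys) {
        int correct = 0;
        for (int i = 0; i < xs.size(); i++)
            if (l.classify(xs.get(i)) == ys.get(i)) correct++;
        return (double) correct / xs.size();
    }

    public static void main(String[] args) {
        ToyLearner learner = new ToyLearner();
        // Finish training first so the optimization is applied...
        learner.doneTraining();
        // ...then report accuracy, so the score reflects the optimized model.
        double acc = accuracy(learner, List.of(1, -2, 3), List.of(true, false, true));
        System.out.println("optimized=" + learner.optimized + " acc=" + acc);
    }
}
```

Scoring before `doneTraining()` would measure a model whose weights are about to change, which is exactly the bug being fixed.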

Member

Ah I see. Makes sense.

@cowchipkid cowchipkid requested a review from mssammon August 25, 2017 16:29
continue;
}
double wt = ltu.getWeightVector().getRawWeights().get(fi);

Contributor Author

Yeah, this is wrong. We need to call hasWeight here, I think; this won't work for SparseAveragedPerceptron, where we need to sum the past average with the actual weight.

Contributor Author

Also, it doesn't even address absolute zero.
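A hedged sketch of the check being discussed in these two comments: for an averaged perceptron the effective weight is the raw weight plus the running average, and the comparison should use the absolute value so that negative and exactly-zero weights are handled. The method name `prunable` and its signature are illustrative, not the actual LBJava code.

```java
public class PruningCheck {
    /**
     * Decide whether a feature's weight is small enough to prune.
     * For the SparseAveragedPerceptron case (per the comment above),
     * the past average must be summed with the current raw weight.
     */
    static boolean prunable(double rawWeight, double averagedWeight, double threshold) {
        double effective = rawWeight + averagedWeight;
        // Absolute value: prunes near-zero weights of either sign,
        // and an exactly-zero weight as well.
        return Math.abs(effective) < threshold;
    }

    public static void main(String[] args) {
        System.out.println(prunable(1e-9, 0.0, 1e-6));  // effectively zero: prune
        System.out.println(prunable(-0.5, 0.4, 1e-6));  // magnitude 0.1: keep
        System.out.println(prunable(0.0, 0.0, 1e-6));   // absolute zero: prune
    }
}
```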

@cowchipkid
Contributor Author

The documentation for the feature pruning stuff is in package-info.java, in the package with all the optimization code. It was already in place; I enhanced it a bit and added some links into the life-cycle methods. I also fixed one bug that likely had no impact: it would not disable pruning when you set the threshold to zero.

for (int i = 0; i < indexes.length; ++i) {
Feature f = inverse.get(indexes[i]);
previousClassName =
previousClassName =
Member

drop the extra space?

/** Default for {@link #bias}. */
public static final double defaultBias = 1.0;
/** Any weight whose magnitude is less than this is considered irrelevant. This is for pruning. */
public static final double defaultFeaturePruningThreshold = 0.000001;
Member

Is this always used in an absolute sense? I.e., does it always have to be positive?
In other words, if it's zero, no pruning will be done.

Contributor Author

@danyaljj setting the threshold to zero is how you disable pruning, effectively.
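A sketch of why a zero threshold disables pruning, assuming the strict `|w| < threshold` comparison implied by the discussion above (the method name `shouldPrune` is illustrative; only `defaultFeaturePruningThreshold` comes from the diff): no absolute value is ever below zero, so nothing is ever pruned.

```java
public class ThresholdSemantics {
    // From the diff above: the default pruning threshold.
    static final double DEFAULT_THRESHOLD = 0.000001;

    /** Strict comparison: with threshold 0, this is false for every weight. */
    static boolean shouldPrune(double weight, double threshold) {
        return Math.abs(weight) < threshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldPrune(1e-8, DEFAULT_THRESHOLD)); // tiny weight: pruned
        System.out.println(shouldPrune(0.0, 0.0));                // threshold 0: never prunes
        System.out.println(shouldPrune(0.5, 0.0));                // threshold 0: never prunes
    }
}
```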

@danyaljj
Member

danyaljj commented Sep 8, 2017

A minor comment; looks good to me!
Unless @mssammon plans to go through this, feel free to merge it.

@mssammon
Contributor

mssammon commented Sep 9, 2017

I am still in the thick of post-move activity; feel free to merge.

@cowchipkid cowchipkid self-assigned this Sep 12, 2017
@cowchipkid cowchipkid merged commit 32587db into master Sep 12, 2017
