Conversation
…lder into each Optimizer. Also, added to each Optimizer a corresponding Tensor that holds the value of the learning rate, and added a feed dictionary that maps the placeholder to the Tensor, so that it can be fed into the runner when running or evaluating. When setLearningRate is called, the learning rate tensor and the feed dictionary are updated.
deansher
left a comment
A couple of quick observations as I start reading this code.
}

/** Returns true if this data type represents a floating point type */
public boolean isFloating() {
This pattern is very uncomfortable to me: DataType being omniscient about TType and dispatching on a string NAME. What's our motivation? If we think it's the best pattern for this situation, perhaps we could document why?
Never mind, in this context -- I see in my local diff that this delta is unrelated to this PR. I'll raise this as an issue.
 * @param graph the TensorFlow Graph
 * @param name the name for this Optimizer (defaults to 'Adadelta')
 * @param learningRate the learning rate
 */
public AdaDelta(Graph graph, String name, float learningRate) {
  this(graph, name, learningRate, 0.95f, 1e-8f);
-> RHO_DEFAULT, EPSILON_DEFAULT
Never mind, in this context.
 * @param learningRate the learning rate.
 */
protected Optimizer(Graph graph, String name) {
protected Optimizer(Graph graph, float learningRate) {
Can we have both of these constructors call into the Optimizer(Graph,float,String) one?
Being that Optimizer is abstract, we really only need one constructor, protected Optimizer(Graph graph, String name, float learningRate). Of course, we would have to handle a null name, with something like:
this.tf = Ops.create(graph).withName(name == null ? getOptimizerName() : name);
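A minimal sketch of that single-constructor approach, assuming a getOptimizerName() hook supplies the default name; the field names and learning-rate handling here are illustrative, not the PR's actual code:

```java
import org.tensorflow.Graph;
import org.tensorflow.op.Ops;

abstract class Optimizer {
  protected final Graph graph;
  protected final Ops tf;
  protected float learningRate;

  // Single protected constructor; a null name falls back to the optimizer's default name.
  protected Optimizer(Graph graph, String name, float learningRate) {
    this.graph = graph;
    this.tf = Ops.create(graph).withName(name == null ? getOptimizerName() : name);
    this.learningRate = learningRate;
  }

  /** Default name used when the caller passes a null name, e.g. "Adadelta". */
  protected abstract String getOptimizerName();
}
```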
CTORS have been changed
  return variable.op().name() + "-" + slotName;
}

/**
Why'd the Javadoc go away?
I am not sure what happened. I had a local copy that I saved and it was there, so will add it back in.
  }
}

/**
I have added it back in. Update pushed
 * @param learningRate the learning rate
 */
public final void setLearningRate(float learningRate) {
  if (this.learningRatePlaceholder == null) {
Everything seems to have grown a this reference. I don't think that's particularly necessary in these methods, as the argument could be newLearningRate rather than learningRate and then there is no aliasing.
I do this out of habit. I can easily change it as you suggest.
Changed setLearningRate to setLearningRate(float newLearningRate), removed the spurious "this.".
Update pushed
 * Sets the learning rate
 *
 * @param learningRate the learning rate
 * @param newLearningRate the new earning rate
tensorflow-framework/src/main/java/org/tensorflow/framework/optimizers/Optimizer.java
…raph graph, String name, float learningRate)", change all the subclass ctors to use this one.
Add Operand<TFloat32> learningRateOperand as an option for learning rate.
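A rough sketch of what that option might look like; this is an assumption about the intent, and the class and field names below are hypothetical, not code from this PR:

```java
import org.tensorflow.Graph;
import org.tensorflow.Operand;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TFloat32;

abstract class OperandLrOptimizer {
  protected final Ops tf;
  // Learning rate as a graph operand, e.g. a constant or the output of a decay
  // schedule, instead of a fixed Java float.
  protected final Operand<TFloat32> learningRateOperand;

  protected OperandLrOptimizer(Graph graph, Operand<TFloat32> learningRateOperand) {
    this.tf = Ops.create(graph);
    this.learningRateOperand = learningRateOperand;
  }
}
```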
Craigacp
left a comment
Just a few snake_case variable names in the tests that need converting to camelCase, and then I'll merge this in.
    new RMSProp(session.getGraph(), learningRate, decay, momentum, epsilon, centered)) {
  Ops tf = session.getTF();
  session.setEpsilon(1e-2f);
  float[] var0_init = {1.0F, 2.0F};
Please switch the Python-style variable names to camelCase.
private FloatNdArray calculateParam(
    FloatNdArray param, float lrT, FloatNdArray m, FloatNdArray v, float epsilon) {
  // param - lrT * mT / (np.sqrt(vT) + epsilon)
    FloatNdArray param, float lr_t, FloatNdArray m, FloatNdArray v, float epsilon) {
Switch the Python-style name to camelCase.
This PR requires some rework due to #174.
Whatever you think is easiest is fine by me.
I am closing for now, and will reopen after we get further along on Model. |
This PR requires PR "Initial checkin of Keras Optimizers and helper classes" to be merged first.
Added changeable learning rate to Optimizers. This was done by adding a Placeholder for the learning rate, a Tensor to track the actual learning rate, and adding a Map to map the Placeholder to the Tensor that can be used to "feed" the runner.
Test Sessions were modified to accept a "FeedDict" Map to populate the feed() of the runner.
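A minimal sketch of that placeholder/feed-dictionary mechanism, assuming the class-based placeholder API and a Runner.feed(Operand, Tensor) overload; the LearningRateFeed class and its method names are illustrative, not the PR's actual code:

```java
import java.util.HashMap;
import java.util.Map;

import org.tensorflow.Operand;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.op.Ops;
import org.tensorflow.op.core.Placeholder;
import org.tensorflow.types.TFloat32;

class LearningRateFeed {
  private final Placeholder<TFloat32> learningRatePlaceholder;
  private TFloat32 learningRateTensor;                          // current learning-rate value
  private final Map<Operand<TFloat32>, Tensor> feedMap = new HashMap<>();

  LearningRateFeed(Ops tf, float initialLearningRate) {
    // Apply-gradient ops would read their learning rate from this placeholder.
    this.learningRatePlaceholder = tf.placeholder(TFloat32.class);
    setLearningRate(initialLearningRate);
  }

  /** Swaps in a new scalar tensor and re-points the feed dictionary at it. */
  final void setLearningRate(float newLearningRate) {
    if (learningRateTensor != null) {
      learningRateTensor.close();                               // release the previous scalar
    }
    learningRateTensor = TFloat32.scalarOf(newLearningRate);
    feedMap.put(learningRatePlaceholder, learningRateTensor);
  }

  /** Feeds the placeholder-to-tensor map into a runner before run()/evaluate(). */
  Session.Runner feedInto(Session.Runner runner) {
    feedMap.forEach(runner::feed);
    return runner;
  }
}
```

A caller would then do something like feedInto(session.runner()).addTarget(applyGradients).run() after each setLearningRate call, so the new value flows into the next step without rebuilding the graph.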