
Conversation

@goldsborough (Contributor)

This PR is a rework of the final part of autogradpp I had not yet touched: the optimization code.

Things I did:

  1. Split up torch/csrc/api/include/optimizers.h into a folder with the Optimizer base class and the concrete optimizers in separate files.
  2. Optimizers now take a vector of parameters instead of a model, in line with PyTorch (see the sketch after this list).
  3. Differentiated between optimizers that require a loss closure (e.g. LBFGS) and those that do not (e.g. SGD, Adam and all the others), also shown in the sketch below.
  4. Wrote a Python script that runs the equivalent PyTorch optimization algorithms on a fixed model setup and dumps the parameter values every 100 steps to a file, then wrote the same model setup in C++ and now compare the exact parameter values every 100 steps (sketched further below). This is the safest way to ensure the correctness of these algorithms. Happy!!! I'll have to see how flaky this is.
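
To make (2) and (3) concrete, here is a minimal usage sketch. It is written against today's `torch::optim` spellings (`SGD`, `SGDOptions`, `LBFGS`, `LBFGSOptions`); the exact signatures in this PR may differ, so read it as an illustration rather than the merged API:

```cpp
#include <torch/torch.h>

int main() {
  // A toy module whose parameters are handed to the optimizers directly.
  torch::nn::Linear model(4, 1);

  // (2): the optimizer is constructed from a vector of parameters,
  // not from the model itself, mirroring Python's torch.optim.
  torch::optim::SGD sgd(model->parameters(), torch::optim::SGDOptions(0.1));

  auto input = torch::randn({8, 4});
  auto target = torch::randn({8, 1});

  // (3a): optimizers like SGD or Adam step without a closure.
  sgd.zero_grad();
  auto loss = torch::mse_loss(model->forward(input), target);
  loss.backward();
  sgd.step();

  // (3b): LBFGS takes a loss closure that re-evaluates the model,
  // since it may need the loss several times per step.
  torch::optim::LBFGS lbfgs(model->parameters(), torch::optim::LBFGSOptions(0.1));
  lbfgs.step([&]() -> torch::Tensor {
    lbfgs.zero_grad();
    auto l = torch::mse_loss(model->forward(input), target);
    l.backward();
    return l;
  });
}
```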

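And the verification flow from (4), sketched from the C++ side. The file name and the one-value-per-line format are hypothetical stand-ins for whatever the Python script actually dumps; the point is the exact, snapshot-by-snapshot comparison:

```cpp
#include <torch/torch.h>

#include <fstream>
#include <string>
#include <vector>

// Hypothetical format: the Python script appends one parameter value per
// line, writing a snapshot of all parameters every 100 steps.
std::vector<float> read_python_dump(const std::string& path) {
  std::vector<float> values;
  std::ifstream file(path);
  for (float v; file >> v;) values.push_back(v);
  return values;
}

// Compare the C++ parameters at one snapshot against the dumped values,
// advancing `cursor` through the flat dump as we go.
void check_snapshot(const std::vector<torch::Tensor>& parameters,
                    const std::vector<float>& expected,
                    size_t& cursor) {
  for (const auto& parameter : parameters) {
    auto flat = parameter.flatten();
    for (int64_t i = 0; i < flat.size(0); ++i) {
      // Exact equality, as described above; this is where any flakiness
      // would show up and where a tolerance could be introduced.
      TORCH_CHECK(flat[i].item<float>() == expected.at(cursor++),
                  "C++ parameter diverged from the PyTorch dump");
    }
  }
}
```
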
Question: Should optimizers also have reference semantics like modules? Optimizers will hardly ever be subclassed, and would likely be instantiated at the point of use. Maybe not necessary, but happy to hear thoughts.

@ebetica @apaszke @ezyang

@ebetica (Contributor) commented Jun 25, 2018

LGTM. I don't have any opinions about the reference semantic-ness of optimizers. I think it's okay not to make them references, as long as they can't easily be copied by accident.
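
A standard way to get that guarantee in C++, purely as an illustration of the idiom rather than a claim about this PR's code: delete the copy operations on the base class and keep it movable.

```cpp
#include <torch/torch.h>

#include <utility>
#include <vector>

// Illustration only: a value-semantics optimizer base class that cannot
// be copied by accident but can still be moved to its point of use.
class Optimizer {
 public:
  explicit Optimizer(std::vector<torch::Tensor> parameters)
      : parameters_(std::move(parameters)) {}

  Optimizer(const Optimizer&) = delete;             // no accidental copies
  Optimizer& operator=(const Optimizer&) = delete;
  Optimizer(Optimizer&&) = default;                 // explicit moves stay cheap
  Optimizer& operator=(Optimizer&&) = default;

  virtual ~Optimizer() = default;
  virtual void step() = 0;

 protected:
  std::vector<torch::Tensor> parameters_;
};
```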

@goldsborough force-pushed the optim branch 4 times, most recently from b530cd5 to 7fd55f4 on June 26, 2018 04:11
@goldsborough merged commit 1f36cac into pytorch:master on June 26, 2018
@goldsborough deleted the optim branch on June 26, 2018 17:13
petrex pushed a commit to ROCm/pytorch that referenced this pull request Jun 26, 2018
* upstream/master: (42 commits)
  [c10d] No default device for ProcessGroupGloo (pytorch#8888)
  Fix default values for affine= in the docstrings of InstanceNormXd (pytorch#8895)
  Stop making dynamic allocations of PinnedMemoryAllocator. (pytorch#8896)
  [C++ API]  Rework optimization package (pytorch#8815)
  Mention MPICH_MAX_THREAD_SAFETY=multiple. (pytorch#8580)
  Unify isViewable, handle n-dimensional empty tensors. (pytorch#8883)
  Add pos_weight argument to nn.BCEWithLogitsLoss (pytorch#5660) (pytorch#6856)
  [build] Enable clang-specific warnings only when using clang (pytorch#8869)
  Fix cmake cudnn autodetection (pytorch#8891)
  [c10d] Fix link order for building C++ tests (pytorch#8889)
  directly add_subdirectory(nanopb) from torch CMakeLists (pytorch#8870)
  [C++ API] Bag of fixes (pytorch#8843)
  [build] Raise in cmake when seeing NVCC{9/9.1} + GCC6 combo (pytorch#8863)
  Create avg_pool1d in ATen (pytorch#8880)
  throw error when grid_sample is passed unsupported mode (pytorch#8884)
  Allow autograd to work even when the shape of values cannot be determined (pytorch#8641)
  Make at::Tensor::to() const (pytorch#8839)
  [auto] Update onnx to 458c521 - Fix typo (onnx/onnx#1143) onnx/onnx@458c521
  [Caffe2] Fix gradient_check on in-place ops (pytorch#8828)
  Fix as_strided_backward (pytorch#8721)
  ...