Skip to content

Conversation

@goldsborough
Copy link
Contributor

@goldsborough goldsborough commented Sep 10, 2018

This PR:

  1. Documents BatchNorm,
  2. Makes a number of API changes after reconsidering some quirks:
    1. The default value for the stateful parameter used to be false, but the most common usage of BatchNorm out of the wild is certainly stateful, and the default in Python is also statefulness. So we change the default to stateful.
    2. The pure_forward function used to use the internal running mean and variance variables instead of the ones supplied to that function call when stateful was true, which certainly seems odd. When you call pure_forward you would certainly expect the values you pass explicitly to be used. This is now fixed.
  3. Adds tests for BatchNorm, finally.

@ebetica @apaszke @ezyang

@ssnl
Copy link
Collaborator

ssnl commented Sep 10, 2018

Now do we want the momentum parameter to be forever wrongly named? Or do we want to make it do the correct thing in c++ api?

@goldsborough
Copy link
Contributor Author

@ssnl I'm open to suggestions. I think it would be ok to give it a different name, especially because I don't imagine people changing it around too much typically. Maybe decay? Or gamma (like in ExponentialLR?
CC @ebetica

@ssnl
Copy link
Collaborator

ssnl commented Sep 10, 2018

gamma has a special meaning in BN, but decay sounds reasonable. fwiw, cudnn names it exponentialAverageFactor.

@goldsborough
Copy link
Contributor Author

@pytorchbot retest this please

@ebetica
Copy link
Contributor

ebetica commented Sep 11, 2018

I don't have a good opinion about the momentum parameter, I might defer this to some of the vision folks? I think it might be a discussion for a future PR in any case.

@ebetica
Copy link
Contributor

ebetica commented Sep 11, 2018

@prigoyal said that momentum is a universally well understood term so renaming it here probably doesn't make too much sense.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you need code owner review on this one.

Copy link
Contributor

@soumith soumith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not diverge the PyTorch and C++ APIs. If we have BatchNorm momentum at the python level, for better or worse we stick to it.
API divergence is a much worse experience than academic correctness.

@goldsborough goldsborough dismissed soumith’s stale review September 11, 2018 21:59

Addressed in latest revert commit

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goldsborough has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

goldsborough has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

petrex pushed a commit to petrex/pytorch that referenced this pull request Sep 12, 2018
* master: (165 commits)
  Aibench for asr decoder
  Explicitly set locale on docs build. (pytorch#11595)
  Documentation for debugging JIT
  Fused weightnorm for ATen (pytorch#10842)
  Move Type, Tensor, TensorMethods to core.
  Add reminder % to the jit
  Fix reloading modules back into python (pytorch#11552)
  Add trigonometry functions to docs/source/onnx.rst
  Add EndToEndHybridModel CUDA tests (pytorch#11544)
  minor formatting error log (pytorch#11528)
  Warn that export+import module always load onto the CPU (pytorch#11485)
  caffe2::StorageImpl use at::DataPtr (pytorch#11282)
  Sync all libnccl soversions, not just libnccl.so.1 (pytorch#11575)
  Document BatchNorm and update default behavior (pytorch#11484)
  Typo fix in randomness.rst (pytorch#11571)
  Move some bmm/baddbmm to ATen (pytorch#11292)
  Make c10d test work on CPU only build (pytorch#11567)
  Clean up some C++ cruftiness in the script lexer.
  Allow setting deletion constant
  Make C10d support CPU only build (pytorch#11513)
  ...
@ezyang ezyang added the merged label Jun 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants