Enable resetting of batchnorm running stats and cumulative ("simple") moving average #5766
Conversation
@pytorchbot retest this please

(for some possibly confusing chicken-scratch motivation of why CMA is needed, see https://www.overleaf.com/read/xydtqhmhmtdz)
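For reference, the cumulative ("simple") moving average discussed here can be written as a recurrence that is equivalent to an exponential moving average whose momentum decays as 1/n. A minimal sketch (the helper name `cma_update` is illustrative, not from the PR):

```python
# Cumulative (simple) moving average as a recurrence: after n samples,
# avg equals the plain arithmetic mean of all samples seen so far.
def cma_update(avg, n, x):
    """One CMA step; equivalent to an EMA step with momentum = 1/(n+1)."""
    return avg + (x - avg) / (n + 1)

avg, n = 0.0, 0
for x in [2.0, 4.0, 6.0]:
    avg = cma_update(avg, n, x)
    n += 1
print(avg)  # 4.0, the arithmetic mean of 2, 4, 6
```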
@pytorchbot retest this please

Our CI was having issues, but it now works again and the failure is on your side. Please fix your PR :) Looks like it's something in the jit.

OK, wasn't sure on that -- thanks for confirming! Looks like I didn't update some JIT unit test expected value, fixed now.

@pytorchbot retest this please

@pytorchbot retest this please
ssnl left a comment:
Looks good in general except for minor comments. After fixing and rebasing, this should be mergeable.
torch/nn/modules/batchnorm.py (review comments marked as off-topic)
It would also be great if we could test the arithmetic average.
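A sketch of such a check, assuming the `momentum=None` behavior this PR proposes (cumulative average of per-batch statistics): after N training batches, `running_mean` should equal the plain arithmetic mean of the per-batch means.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# momentum=None selects the cumulative (simple) moving average.
bn = nn.BatchNorm1d(3, momentum=None)
bn.train()

batch_means = []
for _ in range(5):
    x = torch.randn(16, 3)
    bn(x)
    batch_means.append(x.mean(dim=0))

# With the cumulative average, running_mean should match the plain
# arithmetic mean of the per-batch means (up to float rounding).
expected = torch.stack(batch_means).mean(dim=0)
print(torch.allclose(bn.running_mean, expected, atol=1e-5))
```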
@pytorchbot retest this please

@pytorchbot retest this please

It seems that there are gloo changes. Could you try

Whoops, yeah. There are also some failing tests -- I will debug and @mention you when ready.

@pytorchbot retest this please

@pytorchbot retest this please

@ssnl ready for another look!
ssnl left a comment:
Only one minor nit. Otherwise looks great!
test/test_nn.py (review comment marked as off-topic)
@pytorchbot retest this please
ssnl left a comment:
This can be merged if tests pass :) Thanks @jma127!
@jma127 is the CI not working? Tests start automatically (within ~10 secs) every time you push new commits (i.e., there is usually no need to use pytorchbot) -- just FYI
Oh OK, good to know :P

One test is giving a 404 right now so I'll restart again.

@pytorchbot retest this please

Hmm, now I get: Will retest one more time -- if it still persists, @goldsborough could you investigate?

@pytorchbot retest this please
Sure. Cc @ezyang (our CI wizard)

@pytorchbot retest this please

Let's merge it
…simple") moving average" (#5892) * Revert "Port ATen and JIT C++ tests to Catch2 (#5788)" This reverts commit 6f80023. * Revert "Fix error message for cat-ing zero-dim tensors (#5819)" This reverts commit cf2e176. * Revert "Softmax symbolic should account for negative dim (#5846)" This reverts commit ba64724. * Revert "[fft][1 of 3] build system and helpers to support cuFFT and MKL (#5855)" This reverts commit 22ef8e5. * Revert "Don't modify requires_grad when running DataParallel in no_grad mode (#5880)" This reverts commit d11b7fb. * Revert "fix some methods not showing up in doc (#5882)" This reverts commit 24fca0e. * Revert "ReduceOps cleanup and set_num_threads (#5723)" This reverts commit 84400d5. * Revert "introduce shape_as_tensor and reshape_from_variable_shape (#5824)" This reverts commit f446b82. * Revert "Enable resetting of batchnorm running moments and cumulative ("simple") moving average (#5766)" This reverts commit 99b1f6c.
…simple") moving average" (pytorch#5892) * Revert "Port ATen and JIT C++ tests to Catch2 (pytorch#5788)" This reverts commit 6f80023. * Revert "Fix error message for cat-ing zero-dim tensors (pytorch#5819)" This reverts commit cf2e176. * Revert "Softmax symbolic should account for negative dim (pytorch#5846)" This reverts commit ba64724. * Revert "[fft][1 of 3] build system and helpers to support cuFFT and MKL (pytorch#5855)" This reverts commit 22ef8e5. * Revert "Don't modify requires_grad when running DataParallel in no_grad mode (pytorch#5880)" This reverts commit d11b7fb. * Revert "fix some methods not showing up in doc (pytorch#5882)" This reverts commit 24fca0e. * Revert "ReduceOps cleanup and set_num_threads (pytorch#5723)" This reverts commit 84400d5. * Revert "introduce shape_as_tensor and reshape_from_variable_shape (pytorch#5824)" This reverts commit f446b82. * Revert "Enable resetting of batchnorm running moments and cumulative ("simple") moving average (pytorch#5766)" This reverts commit 99b1f6c.
This is a proposed implementation of reset_running_stats, which I discussed with @soumith earlier today. In addition, it enables the use of the simple moving average for running stats in place of the exponential moving average, via a momentum=None setting. The motivation behind these changes is to faithfully execute the moment estimation of the original batchnorm paper (line 10 in Algorithm 2). To do this, you would do:
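A minimal sketch of the intended usage, assuming the `momentum=None` and `reset_running_stats()` API this PR proposes (the training loop and shapes are illustrative):

```python
import torch
import torch.nn as nn

# momentum=None selects the cumulative (simple) moving average for the
# running statistics, rather than the default exponential moving average.
bn = nn.BatchNorm1d(4, momentum=None)

# Train as usual; running_mean/running_var accumulate a simple average
# over all batches seen since the last reset.
bn.train()
for _ in range(10):
    bn(torch.randn(8, 4))

# Restart the moment estimation from scratch, e.g. before a final
# estimation pass as in Algorithm 2 of the batchnorm paper.
bn.reset_running_stats()
for _ in range(10):
    bn(torch.randn(8, 4))

# Use the freshly accumulated statistics at inference time.
bn.eval()
out = bn(torch.randn(8, 4))
print(out.shape)  # torch.Size([8, 4])
```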