[ready] Layer Normalization #4922

Conversation
Sorry, but could you also fix the LRN docs with this PR? Just need to remove the extra colon here.
Force-pushed from f989db0 to c68b7e5
Force-pushed from 142707f to 1f3df3d
Force-pushed from 45b8220 to 9beb186
@pytorchbot retest this please
self.assertEqual(grad1, grad2)

# track_running_stats=False
module = nn.BatchNorm1d(3, track_running_stats=False).type(test_type)
| """Applies instance normalization over an input. The implementation is | ||
| based on batch_norm, in which we do reshape, batchnorm, and reshape again. | ||
| def batch_norm(input, running_mean, running_var, weight=None, bias=None, | ||
| training=False, momentum=0.1, eps=1e-5): |
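For illustration, a minimal sketch of that reshape trick (the helper name is mine, and it relies on batch_norm accepting optional running stats, which this PR enables):

import torch
import torch.nn.functional as F

def instance_norm_via_batch_norm(x, eps=1e-5):
    # x: (N, C, L). Fold the batch dimension into the channel dimension so that
    # batch_norm's per-channel statistics become per-(sample, channel) statistics.
    n, c, l = x.shape
    out = F.batch_norm(x.reshape(1, n * c, l), running_mean=None,
                       running_var=None, training=True, eps=eps)
    return out.reshape(n, c, l)

x = torch.randn(4, 3, 10)
print(torch.allclose(instance_norm_via_batch_norm(x),
                     F.instance_norm(x), atol=1e-6))  # True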
torch/nn/modules/batchnorm.py

Args:
    num_features: num_features from an expected input of size
        :math:`(N, C, L)` or :math:`(N, L)`
torch/nn/modules/instancenorm.py

Args:
    num_features: num_features from an expected input of size
        :math:`(N, C, L)` or :math:`(N, L)`
zou3519 left a comment:

LGTM!
The docs of LayerNorm and LRN have these: http://pytorch.org/docs/master/nn.html?highlight=layernorm#torch.nn.LayerNorm.forward
@ruotianluo definitely not. This phenomenon is not unique to these two classes though. I'll figure out what's wrong.
Is there a particular reason that LayerNorm takes the full normalized shape, rather than just the number of features as one would with BatchNorm? From the paper, there doesn't seem to be any reason you wouldn't use Layer Normalization in this way. It's a minor change, but it may be a more seamless interface.
@meder411 It doesn't need to be a tensor. It can be a list, tuple, etc. The idea is to be able to normalize over multiple dimensions. Hence the argument is called `normalized_shape`.
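To illustrate, a small sketch using the nn.LayerNorm API (the shapes here are arbitrary):

import torch
import torch.nn as nn

x = torch.randn(20, 5, 10)           # (N, C, L)

# Normalize over the last two dimensions; normalized_shape given as a list.
ln_cl = nn.LayerNorm([5, 10])
y = ln_cl(x)                          # same shape as x

# Normalize over only the last dimension, as is common for sequence models.
ln_l = nn.LayerNorm(10)
y2 = ln_l(x)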
@meder411 While the docs do look correct to me for describing …
self.running_mean.zero_()
self.running_var.fill_(1)

if self.elementwise_affine:
    self.weight.data.uniform_()
* at::maybe_data_ptr and Check.h => TensorUtils.h
* THNN support for optional BN running_*
* ATen support for optional BN running_*
* Python nn.* support for optional BN running_*; improve IN and BN docs
* Add tests for new IN and BN option
* Layer Norm
* Fix LRN doc
* Functional interface for LN and IN
* Layer norm tests
* Fix BN double backward returning undefined tensors
* Fix JIT test using wrong dim inputs for BN
* Add/improve BN, IN and LN GPU tests with half type
* Update docs to be consistent with Conv notation
* Fix ONNX; clarify ONNX symbolic wrapper
* Fix typo
* Address comments
How can I use it with nn.LSTM or nn.GRU? It would be much more helpful if LN could be used with those two classes.
@JustinLin610 Using the `nn.LSTMCell` / `nn.GRUCell` classes, you can apply LayerNorm manually inside the recurrence.
@ssnl I don't think there's a way to modify the cell? The only way would be to copy and paste the code for the nn.GRUCell and then add layernorm manually. It would be nice if there were an option to easily enable it...
I'd like to apply LayerNorm to a multi-layer LSTM but have no idea how to use the LSTMCell classes. @ssnl, could you point me to a simple example of this? LayerNorm is quite new to PyTorch, so it is difficult to find a good example.
@ssnl I agree with jinserk. I've looked far and wide for a full-featured example of LSTMs and GRUs that support layer normalisation, including bidirectionality and multiple layers, and I haven't really found anything. There are some incomplete LSTM versions around, but they don't seem to be feature-complete or optimised very well.
@MBAnslow, @jinserk see https://github.com/pytorch/benchmark/blob/master/rnns/fastrnns/custom_lstms.py for some examples
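For illustration, here is a minimal sketch (the class name and gate layout are mine, not the benchmark code) of adding LayerNorm to a hand-rolled LSTM cell, along the lines of the linked custom_lstms.py:

import torch
import torch.nn as nn

class LayerNormLSTMCell(nn.Module):
    # LSTM cell with LayerNorm on the gate pre-activations and the cell state.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.ih = nn.Linear(input_size, 4 * hidden_size, bias=False)
        self.hh = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        self.ln_ih = nn.LayerNorm(4 * hidden_size)
        self.ln_hh = nn.LayerNorm(4 * hidden_size)
        self.ln_c = nn.LayerNorm(hidden_size)

    def forward(self, x, state):
        h, c = state
        gates = self.ln_ih(self.ih(x)) + self.ln_hh(self.hh(h))
        i, f, g, o = gates.chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(self.ln_c(c))
        return h, c

# Unroll the cell over a (seq_len, batch, input_size) sequence.
cell = LayerNormLSTMCell(input_size=8, hidden_size=16)
x = torch.randn(5, 3, 8)
h = c = torch.zeros(3, 16)
outputs = []
for t in range(x.size(0)):
    h, c = cell(x[t], (h, c))
    outputs.append(h)
out = torch.stack(outputs)   # (5, 3, 16)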
Commits:

* ATen/Check.h => ATen/TensorUtils.h, with an additional method at::maybe_data_ptr added (Make UndefinedTensor's data_ptr() return nullptr #4851).
* running_mean and running_var can be optional when training=True (Feature request: instance norm without the computation of running statistics #4509).
* InstanceNorm*d's use_running_stats (from Fix setting using running stats in InstanceNorm*d #4444) is renamed to the new option track_running_stats on BN. Improves IN and BN docs.
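As a small usage sketch of the new option (assuming the merged API; shapes are illustrative):

import torch
import torch.nn as nn

# BatchNorm without tracked running statistics: batch statistics are always
# used, and no running_mean / running_var buffers are registered.
bn = nn.BatchNorm1d(3, track_running_stats=False)

# InstanceNorm*d gains the same flag (it defaults to not tracking stats).
inorm = nn.InstanceNorm1d(3, track_running_stats=True)

x = torch.randn(4, 3, 10)
print(bn(x).shape, inorm(x).shape)   # torch.Size([4, 3, 10]) twice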