Expose gradients w.r.t. input & weight for conv1d, conv2d, conv3d in Python #5408
Conversation
So, the reservation from Adam is valid, but not in the way he expressed it. I think the right design would be a separate lookup table that branches one step further. For example: […] This way, all this extra API is encapsulated in the […]
@ezyang Yes, that makes sense. If I have understood correctly, we should add all the gradient functions we will expose (now or in the future) via a separate module called […]
@vedanuj correct
I thought about the namespace, and it's not a significantly better solution (the complexity remains the same), but I'm OK with it. One extra thing we might want to think about before we start adding more and more functions there: sometimes the derivative computations for a single op can share a lot of intermediates, so we might need to redesign the API (to avoid having one function per computed gradient).
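To make the shared-intermediates point concrete, here is a hypothetical example (not code from this PR, and a different op than convolution): batch normalization's gradients w.r.t. input and scale both need the normalized activations, so one backward call can compute that intermediate once, whereas a function-per-gradient API would recompute it.

```python
# Hypothetical illustration of gradients of a single op sharing an
# intermediate: both outputs below reuse x_hat (and inv_std), which a
# one-function-per-gradient API would have to recompute.
import torch

def batchnorm_backward(x, gamma, grad_out, eps=1e-5):
    """Return (grad_x, grad_gamma) for y = gamma * (x - mean) / sqrt(var + eps),
    with mean/var taken over the batch dimension (dim 0)."""
    var = x.var(dim=0, unbiased=False)
    inv_std = torch.rsqrt(var + eps)
    x_hat = (x - x.mean(dim=0)) * inv_std       # shared intermediate

    grad_gamma = (grad_out * x_hat).sum(dim=0)  # reuses x_hat
    grad_x = gamma * inv_std * (                # reuses x_hat and inv_std
        grad_out
        - grad_out.mean(dim=0)
        - x_hat * (grad_out * x_hat).mean(dim=0)
    )
    return grad_x, grad_gamma
```

A single entry point returning both gradients keeps the expensive normalization work out of the per-gradient code paths.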
Needs tests.
The last commit adds tests for all the grad functions and addresses other comments. Tests: […]
@pytorchbot test this please
The commit 23a6b3a adds the conv1d gradient w.r.t. input (…)

@pytorchbot retest this please

@pytorchbot retest this please
apaszke
left a comment
LGTM, but I have some minor comments
New commits change some parameter names and remove use of deprecated APIs. @apaszke The TensorFlow APIs for the similar methods require the full shape for the input size and weight size. Should we keep it the same as TF, or change the input size and weight size to specify only the dimensions like height, width, and depth?
No, if we're going to accept the shape then it should be the full shape. Let's leave it as is.
* upstream/master: (663 commits)
  Fix "command not found" error in perf test (pytorch#5982)
  add pip mkl-devel to the error message when mkl is found but mkl headers are not (pytorch#5984)
  Support batch LowerCholeskyTransform (pytorch#5980)
  Linearly interpolating upsampling fix (pytorch#5927)
  Store perf numbers in S3 (pytorch#5951)
  Modidy setup docs for Windows (pytorch#5981)
  Group Normalization (pytorch#5968)
  [distributions] Implement Power transform (pytorch#5976)
  Disable TestBottleneck test_cuda on Windows (pytorch#5977)
  Fix crash when cat-ing empty cuda tensors (pytorch#5971)
  Update no_unions flag for nanopb gen and update ONNX proto files (pytorch#5972)
  Expose gradients w.r.t. input & weight for conv1d, conv2d, conv3d in Python (pytorch#5408)
  Fixed non-determinate preprocessing on DataLoader (pytorch#4640)
  add AVX2 implementation for sigmoid function (pytorch#5010)
  Implement torch.util.bottleneck (pytorch#5216)
  Remove pragma once from cpp file (pytorch#5965)
  fix mvn docs (pytorch#5967)
  Fix incorrect rendering of Tensor.index_*_ doc examples. (pytorch#5969)
  Implement range for loop in script (pytorch#5827)
  Add windows doc (pytorch#5859)
  ...

# Conflicts:
#   aten/src/TH/generic/THTensorMath.c
#   torch/_tensor_docs.py
#   torch/csrc/generic/methods/TensorCompare.cwrap
This PR addresses issue #5024 and exposes in Python gradients with respect to input and weight for `conv2d` and `conv3d`.

Gradients with respect to input of convolution (`F.grad.conv2d_input`, `F.grad.conv3d_input`): for this, `conv_transpose2d`/`conv_transpose3d` are used, which are essentially the gradients of `conv2d`/`conv3d` w.r.t. input. The `output_padding` required for `conv_transpose2d`/`conv_transpose3d` is implicitly calculated from the shape of the input gradient, the shape of the output gradient, the kernel size, the stride, and the padding.

Gradients with respect to weights (`F.grad.conv2d_weight`, `F.grad.conv3d_weight`): for gradients with respect to the weight we use `conv2d`/`conv3d`, but with strides and dilations swapped. We reshape the output gradient and the input in such a way that the output gradient can be treated as the weight of the forward convolution.

Please review @ezyang
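As a usage sketch of the exposed helpers (in current PyTorch they live under `torch.nn.grad`; the exact namespace in this PR may differ), both functions take the *full* shape of the tensor being differentiated, and their results can be checked against autograd:

```python
# Sketch comparing the exposed conv gradient helpers against autograd.
# Assumes the helpers are reachable as torch.nn.grad.conv2d_input /
# conv2d_weight, where current PyTorch exposes them.
import torch
import torch.nn.functional as F
import torch.nn.grad as nn_grad

x = torch.randn(2, 3, 8, 8, dtype=torch.double, requires_grad=True)
w = torch.randn(4, 3, 3, 3, dtype=torch.double, requires_grad=True)
y = F.conv2d(x, w, stride=2, padding=1)

grad_out = torch.randn_like(y)
gx_auto, gw_auto = torch.autograd.grad(y, (x, w), grad_out)

# Both helpers take the full shape of the missing tensor (per the
# full-shape convention discussed above).
gx = nn_grad.conv2d_input(x.shape, w, grad_out, stride=2, padding=1)
gw = nn_grad.conv2d_weight(x, w.shape, grad_out, stride=2, padding=1)

print(torch.allclose(gx, gx_auto))
print(torch.allclose(gw, gw_auto))
```

The full-shape arguments are needed because the gradient shapes are not uniquely determined by the other inputs (e.g. stride and padding make the inverse shape mapping ambiguous).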