
Conversation

vedanuj (Contributor) commented Feb 26, 2018

This PR addresses issue #5024 and exposes, in Python, the gradients with respect to the input and the weight for conv2d and conv3d.

Gradients with respect to the input of the convolution (F.grad.conv2d_input, F.grad.conv3d_input):

For this, conv_transpose2d/conv_transpose3d are used, since they are essentially the gradients of conv2d/conv3d w.r.t. the input. The output_padding required by conv_transpose2d/conv_transpose3d is implicitly calculated from the shape of the input gradient, the shape of the output gradient, the kernel size, the stride, and the padding.
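For concreteness, here is a minimal sketch of that output_padding computation (a hypothetical helper, assuming tuple-valued stride/padding and ignoring dilation and groups; not this PR's exact code):

```python
import torch.nn.functional as F

def grad_input_padding(grad_output, input_size, stride, padding, kernel_size):
    # For each spatial dim, a forward conv with these parameters maps an
    # input of size n to floor((n + 2*pad - k) / stride) + 1 outputs, so
    # several input sizes map to the same output size. output_padding is
    # the gap between the requested input size and the smallest such size.
    spatial_dims = grad_output.dim() - 2
    out_padding = []
    for d in range(spatial_dims):
        min_size = ((grad_output.size(d + 2) - 1) * stride[d]
                    - 2 * padding[d] + kernel_size[d])
        out_padding.append(input_size[d + 2] - min_size)
    return tuple(out_padding)

def conv2d_input_sketch(input_size, weight, grad_output, stride, padding):
    # grad_input is the transposed convolution of grad_output with the same
    # weight, padded back out to the original input shape.
    out_pad = grad_input_padding(grad_output, input_size, stride,
                                 padding, weight.shape[2:])
    return F.conv_transpose2d(grad_output, weight, stride=stride,
                              padding=padding, output_padding=out_pad)
```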

Gradients with respect to the weight (F.grad.conv2d_weight, F.grad.conv3d_weight):

For gradients with respect to the weight, we use conv2d/conv3d, but with the strides and dilations swapped. We reshape the output gradient and the input in such a way that the output gradient can be treated as the weight of a forward convolution.
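A minimal sketch of that reshaping trick for conv2d, assuming groups=1 and tuple-valued stride/padding/dilation (the function name and the simplifications are mine, not this PR's exact code):

```python
import torch.nn.functional as F

def conv2d_weight_sketch(input, weight_size, grad_output,
                         stride=(1, 1), padding=(0, 0), dilation=(1, 1)):
    batch, in_ch = input.shape[0], input.shape[1]
    out_ch = grad_output.shape[1]

    # Fold the batch into the channel dimension so that one grouped
    # convolution computes a correlation per (sample, input channel).
    inp = input.reshape(1, batch * in_ch, *input.shape[2:])
    # Treat grad_output as the "weight" of a forward convolution: one
    # single-channel filter per (sample, input channel, output channel).
    g = grad_output.repeat(1, in_ch, 1, 1).reshape(
        batch * in_ch * out_ch, 1, *grad_output.shape[2:])

    # The key trick: stride and dilation swap roles.
    gw = F.conv2d(inp, g, stride=dilation, padding=padding,
                  dilation=stride, groups=batch * in_ch)

    # Sum over the batch, move to (out_ch, in_ch, kH, kW) layout, and crop
    # the extra rows/cols that appear when the forward conv discarded part
    # of its input (e.g. with stride > 1).
    gw = gw.reshape(batch, in_ch, out_ch, gw.shape[2], gw.shape[3]).sum(0)
    return gw.transpose(0, 1)[:, :, :weight_size[2], :weight_size[3]]
```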

Please review @ezyang


soumith (Contributor) commented Feb 27, 2018

So, the reservation from Adam is valid, but not in the way he expressed it.
What we want is to not increase the API surface to accommodate each of these "backward" calls.
Imagine having:

F.conv2d
F.conv2d_backward_input
F.conv2d_backward_weight
F.conv2d_backward_bias

I think the right design would be a separate lookup table that branches one step further. For example:

F.grad.conv2d_input
F.grad.conv2d_weight
F.grad.conv2d_bias

This way, all this extra API is encapsulated in the grad tree.

ezyang (Contributor) commented Feb 28, 2018

grad is just a module, is that right? SGTM. @vedanuj, does this make sense?

vedanuj (Contributor, Author) commented Feb 28, 2018

@ezyang Yes, makes sense. If I have understood correctly, we should add all the gradient functions we will expose (now or in the future) via a separate module called grad, and add grad to functional.py so that we can enable something like F.grad.conv2d_input, etc. Does that sound correct?
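For reference, the wiring could look roughly like this (a sketch of the layout under discussion, with placeholder bodies; the actual PR code will differ):

```python
# torch/nn/grad.py -- the proposed namespace module; each function would
# hold one of the gradient computations described above.

def conv2d_input(input_size, weight, grad_output, stride=1, padding=0):
    """Gradient of conv2d w.r.t. its input (placeholder in this sketch)."""
    raise NotImplementedError

def conv2d_weight(input, weight_size, grad_output, stride=1, padding=0):
    """Gradient of conv2d w.r.t. its weight (placeholder in this sketch)."""
    raise NotImplementedError


# torch/nn/functional.py -- a single import exposes the tree as F.grad,
# keeping all the extra API under the grad namespace:
from . import grad  # noqa: F401
```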

soumith (Contributor) commented Feb 28, 2018

@vedanuj correct

apaszke (Contributor) commented Feb 28, 2018

I thought about the namespace, and it's not a significantly better solution (the complexity remains the same), but I'm OK with it. One extra thing we might want to think about before we start adding more and more functions there: sometimes the derivative computations for a single op can share a lot of intermediates, so we might need to redesign the API (to avoid having one function per computed gradient).

ezyang added the "awaiting response (this tag is deprecated)" label Mar 1, 2018
vedanuj changed the title from "[WIP] Expose Conv2dBackward in python" to "[WIP] Expose gradients w.r.t. input/weight for conv2d, conv3d in Python" Mar 6, 2018


ezyang (Contributor) commented Mar 6, 2018

Needs tests.

vedanuj (Contributor, Author) commented Mar 11, 2018

The last commit adds tests for all the grad functions and addresses the other comments.

Tests:

python test/test_nn.py TestNN.test_grad_conv2d_input

python test/test_nn.py TestNN.test_grad_conv2d_weight

python test/test_nn.py TestNN.test_grad_conv3d_input

python test/test_nn.py TestNN.test_grad_conv3d_weight
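The core of such a test is checking the exposed functions against autograd; a compressed sketch of the idea (not the PR's actual test code):

```python
import torch
import torch.nn.functional as F

# Compare the manual gradients against what autograd computes.
x = torch.randn(2, 3, 8, 8, requires_grad=True)
w = torch.randn(4, 3, 3, 3, requires_grad=True)
y = F.conv2d(x, w, stride=2, padding=1)
gy = torch.randn_like(y)

gx_auto, gw_auto = torch.autograd.grad(y, (x, w), gy)
gx_manual = F.grad.conv2d_input(x.shape, w, gy, stride=2, padding=1)
gw_manual = F.grad.conv2d_weight(x, w.shape, gy, stride=2, padding=1)

assert torch.allclose(gx_auto, gx_manual, atol=1e-5)
assert torch.allclose(gw_auto, gw_manual, atol=1e-5)
```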

@ezyang @apaszke Please re-review it.

vedanuj changed the title from "[WIP] Expose gradients w.r.t. input/weight for conv2d, conv3d in Python" to "Expose gradients w.r.t. input/weight for conv2d, conv3d in Python" Mar 11, 2018
ezyang (Contributor) commented Mar 12, 2018

@pytorchbot test this please


vedanuj (Contributor, Author) commented Mar 14, 2018

Commit 23a6b3a adds the conv1d gradients w.r.t. input (F.grad.conv1d_input) and weight (F.grad.conv1d_weight), and refactors the tests for all the conv grad functions.

vedanuj (Contributor, Author) commented Mar 14, 2018

@pytorchbot retest this please

vedanuj changed the title from "Expose gradients w.r.t. input/weight for conv2d, conv3d in Python" to "Expose gradients w.r.t. input & weight for conv1d, conv2d, conv3d in Python" Mar 14, 2018
ezyang (Contributor) commented Mar 15, 2018

@pytorchbot retest this please

vedanuj (Contributor, Author) commented Mar 19, 2018

Please review @ezyang @apaszke

apaszke (Contributor) left a comment

LGTM, but I have some minor comments


vedanuj (Contributor, Author) commented Mar 23, 2018

The new commits change some parameter names and remove the use of the deprecated Variable.

@apaszke The TensorFlow APIs for the similar methods require the full shape for the input size and weight size. Should we keep it the same as TF, or change input size and weight size to specify only the spatial dimensions (height, width, depth)?

apaszke (Contributor) commented Mar 23, 2018

No, if we're going to accept the shape then it should be the full shape. Let's leave it as is.
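So a caller passes the full tensor shapes, for example (an illustrative call, assuming the signatures discussed in this PR):

```python
import torch
import torch.nn.functional as F

weight = torch.randn(8, 3, 5, 5)          # (out_ch, in_ch, kH, kW)
grad_output = torch.randn(1, 8, 28, 28)   # (N, out_ch, H', W')

# input_size is the full shape (N, C, H, W), not just the spatial dims:
grad_input = F.grad.conv2d_input((1, 3, 32, 32), weight, grad_output)
assert grad_input.shape == (1, 3, 32, 32)
```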

ezyang merged commit f3e16cc into pytorch:master Mar 23, 2018
vedanuj deleted the conv_2d_backward branch March 24, 2018
sighingnow added a commit to sighingnow/pytorch that referenced this pull request Mar 25, 2018
* upstream/master: (663 commits)
  Fix "command not found" error in perf test (pytorch#5982)
  add pip mkl-devel to the error message when mkl is found but mkl headers are not (pytorch#5984)
  Support batch LowerCholeskyTransform (pytorch#5980)
  Linearly interpolating upsampling fix (pytorch#5927)
  Store perf numbers in S3 (pytorch#5951)
  Modidy setup docs for Windows (pytorch#5981)
  Group Normalization (pytorch#5968)
  [distributions] Implement Power transform (pytorch#5976)
  Disable TestBottleneck test_cuda on Windows (pytorch#5977)
  Fix crash when cat-ing empty cuda tensors (pytorch#5971)
  Update no_unions flag for nanopb gen and update ONNX proto files (pytorch#5972)
  Expose gradients w.r.t. input & weight for conv1d, conv2d, conv3d in Python (pytorch#5408)
  Fixed non-determinate preprocessing on DataLoader (pytorch#4640)
  add AVX2 implementation for sigmoid function (pytorch#5010)
  Implement torch.util.bottleneck (pytorch#5216)
  Remove pragma once from cpp file (pytorch#5965)
  fix mvn docs (pytorch#5967)
  Fix incorrect rendering of Tensor.index_*_ doc examples. (pytorch#5969)
  Implement range for loop in script (pytorch#5827)
  Add windows doc (pytorch#5859)
  ...

# Conflicts:
#	aten/src/TH/generic/THTensorMath.c
#	torch/_tensor_docs.py
#	torch/csrc/generic/methods/TensorCompare.cwrap