[PyTorch] [gradcheck] change backward() to grad() #7710
Conversation
… Replace unnecessary create_graph=True with retain_graph=True
The current gradgradcheck is broken because of how it constructs the inputs (in particular the grad_outputs) it passes to gradcheck: with gen_non_contig_grad_outputs=True the generated grad_outputs do not require grad, and the gradient with respect to the grad_output comes out as all zeros:

>>> # print numerical and analytical grad wrt the grad_output
>>> torch.autograd.gradgradcheck(lambda x: x, [torch.randn(3).requires_grad_()], gen_non_contig_grad_outputs=False)
(tensor([[ 1.0133,  0.0000,  0.0000],
         [ 0.0000,  1.0133,  0.0000],
         [ 0.0000,  0.0000,  1.0133]]),
 tensor([[ 1.,  0.,  0.],
         [ 0.,  1.,  0.],
         [ 0.,  0.,  1.]]))
>>> torch.autograd.gradgradcheck(lambda x: x, [torch.randn(3).requires_grad_()], gen_non_contig_grad_outputs=True)
(tensor([[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]]),
 tensor([[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]]))

Also fixing this.
torch/autograd/gradcheck.py
Outdated
input = contiguous(input)
target = contiguous(target)
if not all(t.is_contiguous() for t in iter_tensors(target)):
    raise RuntimeError("target must only contain contiguous Tensors")
torch/autograd/gradcheck.py
Outdated
input_args = tuple(x for x in input_args if isinstance(x, torch.Tensor) and x.requires_grad)
grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs, create_graph=True)
grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs,
                                  create_graph=True, only_inputs=True, allow_unused=True)
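For context, a minimal sketch (not the PR's code) of why the new grad() call passes allow_unused=True: if one of the differentiable inputs does not influence the outputs, torch.autograd.grad() raises an error unless allow_unused=True is set, in which case it simply returns None for that input.

import torch

# Minimal sketch: y requires grad but never feeds into the output, so grad()
# would raise without allow_unused=True; with the flag it returns None for y.
x = torch.randn(3, requires_grad=True)
y = torch.randn(3, requires_grad=True)   # unused below
out = (x * 2).sum()

gx, gy = torch.autograd.grad(out, (x, y), allow_unused=True)
print(gx)   # tensor([2., 2., 2.])
print(gy)   # None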
torch/autograd/gradcheck.py
Outdated
'although analytical gradient matches numerical gradient')

# check if the backward multiplies by grad_output
zero_gradients(inputs)
I just got rid of the contiguity enforcement altogether. Hope that everything still works.
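As a quick sanity sketch (not from the PR itself), autograd and gradcheck handle non-contiguous tensors, which is why the contiguity enforcement shown above can be dropped:

import torch

# Minimal sketch, assuming a simple elementwise function: a transposed view is
# non-contiguous, yet gradcheck still passes, so forcing inputs/targets to be
# contiguous is not needed for correctness.
x = torch.randn(4, 3, dtype=torch.double).t().requires_grad_()
print(x.is_contiguous())                                        # False
print(torch.autograd.gradcheck(lambda t: (t * t).sum(), (x,)))  # True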
ggO,
ggW_maybe_squeeze.expand_as(gO) * gO * nonpositive_mask,
(ggI * gO * nonpositive_mask).sum()
(ggI * gO * nonpositive_mask).sum().expand_as(weight)
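A small sketch (hypothetical shapes, not the PR's test) of what the added .expand_as(weight) fixes: the double-backward contribution to ggW is reduced with .sum(), which produces a 0-dim scalar, but the gradient returned for weight has to match weight's shape.

import torch

# Minimal sketch: the summed contribution is 0-dimensional and must be expanded
# back to the shape of a 1-element PReLU-style weight before being returned.
weight = torch.randn(1)
ggI = torch.randn(2, 3)
gO = torch.randn(2, 3)
nonpositive_mask = (torch.randn(2, 3) < 0).float()

contrib = (ggI * gO * nonpositive_mask).sum()   # torch.Size([])
ggW = contrib.expand_as(weight)                 # torch.Size([1]), matches weight
print(contrib.shape, ggW.shape)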
@apaszke This is ready for review.
apaszke left a comment
LGTM. One concern about gradgradcheck that would be good to clarify before merging.
# need data here to get around the version check because without .data,
# the following code updates version but doesn't change content
x_tensor = x_tensor.data
for d_idx, x_idx in enumerate(product(*[range(m) for m in x_tensor.size()])):
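To illustrate the comment in that hunk, here is a minimal sketch (a hypothetical helper, not the real gradcheck code) of central-difference numerical gradients that perturbs x through .data: because .data bypasses the autograd version counter, the later analytical backward pass does not complain about in-place modification, even though the values were temporarily changed and then restored.

import torch

# Hypothetical helper: perturb each element via x.data, evaluate fn at +/- eps,
# restore the original value, and form the central difference.
def numerical_grad(fn, x, eps=1e-6):
    grad = torch.zeros_like(x)
    x_tensor = x.data                       # bypasses the version check
    flat, gflat = x_tensor.view(-1), grad.view(-1)
    for i in range(flat.numel()):
        orig = flat[i].item()
        flat[i] = orig + eps
        fp = fn(x).item()
        flat[i] = orig - eps
        fm = fn(x).item()
        flat[i] = orig                      # restore: content is unchanged overall
        gflat[i] = (fp - fm) / (2 * eps)
    return grad

x = torch.randn(3, dtype=torch.double, requires_grad=True)
print(numerical_grad(lambda t: (t ** 2).sum(), x))   # approximately 2 * x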
torch/autograd/gradcheck.py
Outdated
var.requires_grad = True
return var
y = torch.testing.make_non_contiguous(y).requires_grad_()
return y
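A short sketch of why the requires_grad_() call matters (the helper below is a hypothetical stand-in for torch.testing.make_non_contiguous, which existed at the time of this PR): the non-contiguous copy is a brand-new tensor that does not require grad, so unless requires_grad is re-applied, the analytical gradient with respect to that grad_output silently comes out as zeros, as in the REPL output earlier in this thread.

import torch

# Hypothetical stand-in for make_non_contiguous (1-D only): copy t into a
# stride-2 slice of a larger buffer, yielding a non-contiguous tensor that
# does not require grad.
def make_non_contiguous_copy(t):
    buf = torch.empty(t.numel() * 2, dtype=t.dtype)
    nc = buf[::2]                    # stride-2 view: non-contiguous
    with torch.no_grad():
        nc.copy_(t)
    return nc

gO = torch.randn(3, requires_grad=True)
nc = make_non_contiguous_copy(gO)
print(nc.is_contiguous(), nc.requires_grad)          # False False
nc = make_non_contiguous_copy(gO).requires_grad_()   # the fix applied here
print(nc.requires_grad)                              # True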
def new_func(*args):
    input_args = args[:-num_outputs]
    grad_outputs = args[-num_outputs:]
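To show the idea behind new_func, here is a minimal sketch (a hypothetical, simplified wrapper in the spirit of gradgradcheck, not the actual implementation): the wrapped function takes the original inputs plus the grad_outputs, returns torch.autograd.grad of the outputs with create_graph=True, and that wrapper is then run through an ordinary gradcheck.

import torch

# Hypothetical wrapper: gradcheck the first-order grads as a function of
# (inputs..., grad_outputs...).
def make_double_backward_func(func, num_outputs):
    def new_func(*args):
        input_args = args[:-num_outputs]
        grad_outputs = args[-num_outputs:]
        outputs = func(*input_args)
        outputs = (outputs,) if isinstance(outputs, torch.Tensor) else outputs
        diff_inputs = tuple(x for x in input_args
                            if isinstance(x, torch.Tensor) and x.requires_grad)
        return torch.autograd.grad(outputs, diff_inputs, grad_outputs,
                                   create_graph=True, allow_unused=True)
    return new_func

x = torch.randn(3, dtype=torch.double, requires_grad=True)
gO = torch.randn(3, dtype=torch.double, requires_grad=True)
wrapped = make_double_backward_func(lambda t: t * t, num_outputs=1)
print(torch.autograd.gradcheck(wrapped, (x, gO)))   # True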
…e2_core_hip
* 'caffe2_core_hip' of github.com:petrex/pytorch: (24 commits)
  Allow empty storage for the 'Edge' class. (pytorch#7595)
  Process group base class and Gloo implementation (pytorch#7628)
  _LRSchedulers getstate include optimizer info (pytorch#7757)
  [PyTorch] [gradcheck] change backward() to grad() (pytorch#7710)
  Update test_nn.py (pytorch#7787)
  Define general default scheduler for TBB and fix ppc64le bug (pytorch#7761)
  Add support for accepting Tensor as input in clip_grad_* functions. (pytorch#7769)
  [Easy] Remove unused code (pytorch#7782)
  Update tbb (pytorch#7734)
  Add @generated annotation (pytorch#7780)
  fix legacy comment after variable tensor merge (pytorch#7771)
  Revert pytorch#7750 and pytorch#7762 to fix Windows CI on master (pytorch#7772)
  Temporarily disable build env check (pytorch#7768)
  Add missing brace (pytorch#7762)
  [C++ API] Add backward() to Tensor and Variable (pytorch#7750)
  [auto] Update onnx to d43b550 - Fix .gitignore and add missing files (onnx/onnx#1005) onnx/onnx@d43b550
  [auto] Update onnx to ea1aa13 - add tests for reduce ops (onnx/onnx#675) onnx/onnx@ea1aa13
  include cudnn_h (pytorch#7749)
  [C++ API] Using new registration mechanism (pytorch#7663)
  [auto] Update onnx to 5dd68e6 - Add a util function: polish_model (onnx/onnx#1000) onnx/onnx@5dd68e6
  ...
* Change backward calls to grad to avoid memory leak from pytorch#7343; Replace unnecessary create_graph=True with retain_graph=True
* fix gradgradcheck use of make_non_contiguous
* allow non-contiguous target
* remove unnecessary .grad.zero_()
* remove contiguous_detach
* fix PReLU double backward always returning ggW as a scalar
* let noncontig gO require grad
* move requires_grad to return
Change backward calls to grad to avoid memory leak from #7343
Replace unnecessary create_graph=True with retain_graph=True
The memory leak is blocking #7270.
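A minimal sketch (not the gradcheck code itself) of what these two changes mean in practice: torch.autograd.grad() returns the gradients directly and never populates .grad, so repeated checks need no zeroing and do not keep accumulating state the way the backward() calls blamed in #7343 did; retain_graph=True only keeps the graph alive for repeated calls, while create_graph=True would additionally record the backward pass itself, which is unnecessary when no double backward is taken.

import torch

# Minimal sketch: repeated grad() calls with retain_graph=True reuse the same
# graph, return the gradients directly, and leave x.grad untouched.
x = torch.randn(3, requires_grad=True)
out = (x ** 2).sum()

for _ in range(3):
    (gx,) = torch.autograd.grad(out, (x,), retain_graph=True)

print(gx)       # 2 * x
print(x.grad)   # None: nothing was accumulated into .grad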