Conversation

@ssnl
Collaborator

@ssnl ssnl commented May 20, 2018

* Change backward calls to grad to avoid memory leak from #7343
* Replace unnecessary create_graph=True with retain_graph=True

The memory leak is blocking #7270.
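
For reference, here is a minimal sketch (not the gradcheck code itself) of the distinction this relies on: grad() returns gradients directly instead of accumulating into .grad, retain_graph=True only keeps the graph alive for another backward pass, and create_graph=True additionally records the backward pass so it can itself be differentiated.

import torch

x = torch.randn(3, requires_grad=True)
y = (x ** 2).sum()

# retain_graph=True: the graph of y survives so it can be differentiated again,
# but no graph is recorded for the gradient computation itself.
g1, = torch.autograd.grad(y, x, retain_graph=True)

# create_graph=True: additionally records the backward pass, which is only
# needed when the gradient itself will be differentiated (double backward).
g2, = torch.autograd.grad(y, x, create_graph=True)
gg, = torch.autograd.grad(g2.sum(), x)

# y.backward() would instead accumulate into x.grad and leave that state
# behind; grad() returns the gradients without touching x.grad.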

@ssnl
Collaborator Author

ssnl commented May 20, 2018

The current gradgradcheck is broken because the inputs containing grad_outputs are not actually used in the function. So when they are non-contiguous and gradcheck makes them contiguous, things break.

>>> # print numerical and analytical grad wrt the grad_output
>>> torch.autograd.gradgradcheck(lambda x: x, [torch.randn(3).requires_grad_()], gen_non_contig_grad_outputs=False)
(tensor([[ 1.0133,  0.0000,  0.0000],
        [ 0.0000,  1.0133,  0.0000],
        [ 0.0000,  0.0000,  1.0133]]), tensor([[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]]))

>>> torch.autograd.gradgradcheck(lambda x: x, [torch.randn(3).requires_grad_()], gen_non_contig_grad_outputs=True)
(tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]), tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]))

Also fixing this.
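
In spirit, the fix is to make grad_outputs genuine inputs of the function that gradcheck differentiates, so that perturbing them or changing their layout actually affects the result. A rough sketch of that wrapping follows (illustrative names, not the exact code in this PR; the real helper does the same kind of split, as the new_func snippet further down shows). Both the inputs and the grad_outputs are assumed to be double-precision tensors with requires_grad=True, as gradcheck expects.

import torch

def gradgradcheck_sketch(fn, inputs, grad_outputs):
    num_outputs = len(grad_outputs)

    def new_func(*args):
        # grad_outputs are passed in as trailing arguments, so gradcheck's
        # perturbations and contiguity changes are actually used below.
        input_args = args[:-num_outputs]
        g_outs = args[-num_outputs:]
        outputs = fn(*input_args)
        if isinstance(outputs, torch.Tensor):
            outputs = (outputs,)
        return torch.autograd.grad(outputs, input_args, g_outs, create_graph=True)

    return torch.autograd.gradcheck(new_func, tuple(inputs) + tuple(grad_outputs))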

input = contiguous(input)
target = contiguous(target)
if not all(t.is_contiguous() for t in iter_tensors(target)):
    raise RuntimeError("target must only contain contiguous Tensors")

input_args = tuple(x for x in input_args if isinstance(x, torch.Tensor) and x.requires_grad)
grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs, create_graph=True)
grad_inputs = torch.autograd.grad(outputs, input_args, grad_outputs,
                                  create_graph=True, only_inputs=True, allow_unused=True)
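
Aside, not part of the diff: with allow_unused=True, grad() returns None for inputs the outputs do not depend on instead of raising an error. A tiny illustration:

import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
out = (a * 2).sum()   # b never enters the graph

ga, gb = torch.autograd.grad(out, (a, b), allow_unused=True)
# ga is a tensor of 2s; gb is None, because b is unused and allow_unused=True
# permits that instead of raising an error.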

'although analytical gradient matches numerical gradient')

# check if the backward multiplies by grad_output
zero_gradients(inputs)
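
A small illustration (not from the diff) of what the comment above checks: a correct backward must scale by the incoming grad_output, so backwarding with zeros must yield exactly zero gradients.

import torch

x = torch.randn(3, requires_grad=True)
y = x * 3

# If backward correctly multiplies by grad_output, a zero grad_output gives a zero gradient.
g, = torch.autograd.grad(y, x, torch.zeros_like(y))
print(g)   # tensor([0., 0., 0.])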

@ssnl
Collaborator Author

ssnl commented May 21, 2018

I just got rid of the contiguity enforcement altogether. Hope that everything still works.

ggO,
ggW_maybe_squeeze.expand_as(gO) * gO * nonpositive_mask,
(ggI * gO * nonpositive_mask).sum()
(ggI * gO * nonpositive_mask).sum().expand_as(weight)
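
Context, as I read the fix: with PReLU's single shared weight, the .sum() reduction left ggW as a 0-dim scalar, and .expand_as(weight) restores the weight's shape. A toy shape check (the contrib name is just a stand-in for ggI * gO * nonpositive_mask):

import torch

weight = torch.randn(1)       # PReLU's single shared weight has shape (1,)
contrib = torch.randn(4, 5)   # stand-in for ggI * gO * nonpositive_mask

print(contrib.sum().shape)                    # torch.Size([])  -- a 0-dim scalar
print(contrib.sum().expand_as(weight).shape)  # torch.Size([1]) -- matches weight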

@ssnl
Collaborator Author

ssnl commented May 21, 2018

@apaszke This is ready for review.

Contributor

@apaszke apaszke left a comment

LGTM. One concern about gradgradcheck that would be good to clarify before merging.

# need data here to get around the version check because without .data,
# the following code updates version but doesn't change content
x_tensor = x_tensor.data
for d_idx, x_idx in enumerate(product(*[range(m) for m in x_tensor.size()])):
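
A tiny illustration of the version-counter point in the comment above; it uses the private Tensor._version attribute, so treat it purely as a debugging sketch:

import torch

x = torch.randn(3)
d = x.detach()       # shares storage and the version counter with x
d[0] = 1.0
print(x._version)    # bumped -> autograd's version check would notice this write

raw = x.data         # shares storage, but writes bypass x's version counter
raw[1] = 2.0
print(x._version)    # unchanged -> the write is invisible to the version check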

var.requires_grad = True
return var
y = torch.testing.make_non_contiguous(y).requires_grad_()
return y

def new_func(*args):
    input_args = args[:-num_outputs]
    grad_outputs = args[-num_outputs:]

@ssnl ssnl merged commit e3e15b5 into pytorch:master May 23, 2018
@ssnl ssnl deleted the improve_gradcheck branch May 23, 2018 15:03
petrex pushed a commit to petrex/pytorch that referenced this pull request May 23, 2018
…e2_core_hip

* 'caffe2_core_hip' of github.com:petrex/pytorch: (24 commits)
  Allow empty storage for the 'Edge' class. (pytorch#7595)
  Process group base class and Gloo implementation (pytorch#7628)
  _LRSchedulers getstate include optimizer info (pytorch#7757)
  [PyTorch] [gradcheck] change backward() to grad() (pytorch#7710)
  Update test_nn.py (pytorch#7787)
  Define general default scheduler for TBB and fix ppc64le bug (pytorch#7761)
  Add support for accepting Tensor as input in clip_grad_*  functions. (pytorch#7769)
  [Easy] Remove unused code (pytorch#7782)
  Update tbb (pytorch#7734)
  Add @generated annotation (pytorch#7780)
  fix legacy comment after variable tensor merge (pytorch#7771)
  Revert pytorch#7750 and pytorch#7762 to fix Windows CI on master (pytorch#7772)
  Temporarily disable build env check (pytorch#7768)
  Add missing brace (pytorch#7762)
  [C++ API] Add backward() to Tensor and Variable  (pytorch#7750)
  [auto] Update onnx to d43b550 - Fix .gitignore and add missing files (onnx/onnx#1005) onnx/onnx@d43b550
  [auto] Update onnx to ea1aa13 - add tests for reduce ops (onnx/onnx#675) onnx/onnx@ea1aa13
  include cudnn_h (pytorch#7749)
  [C++ API] Using new registration mechanism (pytorch#7663)
  [auto] Update onnx to 5dd68e6 - Add a util function: polish_model (onnx/onnx#1000) onnx/onnx@5dd68e6
  ...
weiyangfb pushed a commit to weiyangfb/pytorch that referenced this pull request Jun 11, 2018
* Change backward calls to grad to avoid memory leak from pytorch#7343; Replace unnecessary create_graph=True with retain_graph=True

* fix gradgradcheck use of make_non_contiguous

* allow non-contiguous target

* remove unnecessary .grad.zero_()

* remove contiguous_detach

* fix PReLU double backward always returning ggW as a scalar

* let noncontig gO require grad

* move requires_grad to return