Skip to content

Conversation

@gchanan
Copy link
Contributor

@gchanan gchanan commented Sep 9, 2020

Stack from ghstack:

MSELoss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:

  1. adds derivative support for target via the normal derivatives.yaml route
  2. kill the different (and incorrect) path for when target.requires_grad was True
  3. modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:

  1. do we still need check_criterion_jacobian when we run grad/gradgrad checks?
  2. ensure the Module tests check when target.requires_grad
  3. do we actually test when reduction='none' and reduction='mean'?

Differential Revision: D23612166

MSELoss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

[ghstack-poisoned]
@gchanan gchanan requested a review from apaszke as a code owner September 9, 2020 22:38
gchanan added a commit that referenced this pull request Sep 9, 2020
MSELoss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

ghstack-source-id: cbfba5f
Pull Request resolved: #44437
@dr-ci
Copy link

dr-ci bot commented Sep 9, 2020

💊 CI failures summary and remediations

As of commit c472bb2 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

1 failure confirmed as flaky and can be ignored:

  • pytorch_bazel_test

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 8 times.

@gchanan
Copy link
Contributor Author

gchanan commented Sep 10, 2020

also need to fix the cpp API.
Look athttps://github.com/pytorch/pytorch/blob/af9cad761ad1dd215470ebce58932b82d6ba8bee/test/test_nn.py#L9787 at detached cases here.

@codecov
Copy link

codecov bot commented Sep 10, 2020

Codecov Report

Merging #44437 into gh/gchanan/322/base will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@                   Coverage Diff                   @@
##           gh/gchanan/322/base   #44437      +/-   ##
=======================================================
- Coverage                68.00%   67.99%   -0.01%     
=======================================================
  Files                      382      382              
  Lines                    49379    49373       -6     
=======================================================
- Hits                     33578    33572       -6     
  Misses                   15801    15801              
Impacted Files Coverage Δ
torch/nn/functional.py 92.18% <100.00%> (-0.04%) ⬇️
torch/testing/_internal/common_nn.py 83.06% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f72e7c...c472bb2. Read the comment docs.

MSELoss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

Differential Revision: [D23612166](https://our.internmc.facebook.com/intern/diff/D23612166)

[ghstack-poisoned]
gchanan added a commit that referenced this pull request Sep 10, 2020
MSELoss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

ghstack-source-id: 88dab44
Pull Request resolved: #44437
@gchanan gchanan requested a review from albanD September 10, 2020 16:57
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
Not sure why we didn't do that before... :D

# currently compute the gradient w.r.t. target for loss functions.
gradcheck(apply_fn, inputs)
if target.requires_grad:
inputs = inputs + (target,)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works but it is a bit fragile.
Your apply_fn above actually ignores the target that is given in *params. But it works because it captures the same target from the parent scope. And the gradcheck ensures that you can do that. But I think it would be clearer to just get the target from the apply_fn inputs properly no?

@facebook-github-bot
Copy link
Contributor

@gchanan merged this pull request in d07d25a.

@facebook-github-bot facebook-github-bot deleted the gh/gchanan/322/head branch September 15, 2020 14:16
xuzhao9 pushed a commit that referenced this pull request Sep 18, 2020
Summary:
Pull Request resolved: #44437

MSELoss had a completely different (and incorrect, see #43228) path when target.requires_grad was True.

This PR does the following:
1) adds derivative support for target via the normal derivatives.yaml route
2) kill the different (and incorrect) path for when target.requires_grad was True
3) modify the MSELoss CriterionTests to verify that the target derivative is checked.

TODO:
1) do we still need check_criterion_jacobian when we run grad/gradgrad checks?
2) ensure the Module tests check when target.requires_grad
3) do we actually test when reduction='none' and reduction='mean'?

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D23612166

Pulled By: gchanan

fbshipit-source-id: 4f74d38d8a81063c74e002e07fbb7837b2172a10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants