@t-vi t-vi commented Jul 8, 2020

When we return to Python from C++ in PyTorch and have both warnings and an error, we have the problem of what to do when the warnings throw, because we can only throw one error.
Previously, if we had an error, we punted all warnings to the C++ warning handler, which would write them to stderr (i.e. system file descriptor 2) or pass them on to glog.

This has drawbacks if an error happened:

  • Warnings are not handled through Python even if they don't raise,
  • warnings are always printed with no way to suppress this,
  • the printing bypasses sys.stderr, so Python code that replaces or
    redirects it has no effect (with the prominent example being Jupyter).

This patch does the following instead:

  • Set the warning using standard Python extension mechanisms,
  • if Python decides that this warning is an error and we have a
    PyTorch error, we print the warning through Python and clear
    the error state (from the warning).

This resolves the three drawbacks discussed above; in particular, it fixes #37240.
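
For readers who prefer code, the new behaviour can be approximated in pure Python. This is an analogue of the logic described above, not the C++ code in the patch: `flush_warnings` and its parameters are hypothetical names, and the plain `print` to `sys.stderr` stands in for however the patch prints the warning through Python.

```python
import contextlib
import io
import sys
import warnings


def flush_warnings(pending_warnings, pending_error=None):
    """Hypothetical stand-in for the return path from C++ to Python."""
    for message in pending_warnings:
        try:
            # Hand every warning to Python so the active filters decide.
            warnings.warn(message, UserWarning, stacklevel=2)
        except UserWarning as escalated:
            if pending_error is None:
                # No other error is pending: the escalated warning is the error.
                raise
            # Only one exception can propagate, so print the warning via
            # sys.stderr and drop the exception it produced.
            print(f"UserWarning: {escalated}", file=sys.stderr)
    if pending_error is not None:
        raise pending_error


captured = io.StringIO()
outcome = None
with warnings.catch_warnings(), contextlib.redirect_stderr(captured):
    warnings.simplefilter("error")  # Python now turns warnings into exceptions
    try:
        flush_warnings(["anomaly detected"], RuntimeError("backward failed"))
    except RuntimeError as err:
        outcome = err
```

With the `"error"` filter active and an error already pending, the caller still sees the `RuntimeError`, and the warning text ends up in whatever `sys.stderr` currently is. With an `"ignore"` filter the warning is dropped silently, which the old stderr-based path could not do.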

@t-vi t-vi requested a review from albanD July 8, 2020 07:14

dr-ci bot commented Jul 8, 2020

💊 CI failures summary and remediations

As of commit 341cf30 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


t-vi commented Jul 8, 2020

This needs some more work before it's ready...

t-vi commented Jul 8, 2020

@albanD now it's actually good to look at. I'll rebase.

t-vi commented Jul 9, 2020

@kostmo In the CI analysis of the first few iterations of the patch, error output from a passing test was highlighted instead of the output of the failing tests. I know the build log looks funny, and that seems to have confused the matching.
(The line ERROR:sccache::server: Compilation failed: Output... is spurious.)

t-vi commented Jul 9, 2020

@albanD I'm going to claim that the ROCm build error is unrelated. What do you think?

@albanD albanD left a comment

Looks good, thanks!

ROCm is green on master, so I guess flakiness?

@facebook-github-bot facebook-github-bot left a comment

@albanD is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@albanD merged this pull request in a318234.

Linked issue (may be closed by this pull request): Warnings from anomaly mode get eaten up by Jupiter Notebook (including on Colab)