-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Extend Graph Export to NNC, extend script to support CPU #74076
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[ghstack-poisoned]
CI Flow Status⚛️ CI FlowRuleset - Version:
|
🔗 Helpful links
💊 CI failures summary and remediationsAs of commit 90b0540 (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
davidberard98
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks for adding the cpu timing!
Left a comment about whether we should print IR, otherwise looks good
| print(f" Graph {i}; baseline: {baseline:.2f} ms; nvfuser: {nvfuser:.2f} ms; improvement: {improvement:.2f}%") | ||
| except RuntimeError: | ||
| print(f" Graph {i} failed:", traceback.format_exc()) | ||
| print(f"Running Graph {ir}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - I thought it was nice having just the index, because it gets cluttered when we print the ir along with the performance results (especially with the nvfuser/torchbench results where there's ~2500 fusion groups). Plus it might be easier to export to csv if we're only printing indices.
But I'm fine with leaving in the IR if you think it's important
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure will update
|
@eellison Thanks for extending the script for NNC! Can you specify what output file it generates? I checked the PR out and tried it with a local test but did not see any outputs. Here's the log file for my local test: P486118155 |
|
@huiguoo it should have around the graph - I repro'd and got this working locally |
|
also - I think time.perf_counter() returns seconds, while the cuda timer measures in milliseconds. Might be good to make them all use the same units. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` [ghstack-poisoned]
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` [ghstack-poisoned]
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Differential Revision: [D34946883](https://our.internmc.facebook.com/intern/diff/D34946883) [ghstack-poisoned]
|
@eellison has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Summary: Pull Request resolved: #74076 Extends the repro script to cpu and NNC. As in file: Usage: ``` 1. Run your script and pipe into a log file PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt 2. Run log_extract: log_extract.py log.txt --baseline --nnc ``` Test Plan: Imported from OSS Reviewed By: gchanan Differential Revision: D34946883 Pulled By: eellison fbshipit-source-id: 644012dbbca0b490820ef83e761c06b0dd009e52
|
Hey @eellison. |
ghstack-source-id: 2551ab5 Pull Request resolved: pytorch/pytorch#74076
Stack from ghstack:
Extends the repro script to cpu and NNC. As in file:
Usage:
Differential Revision: D34946883