[NCCL] Add option to run NCCL on high priority cuda stream #43796
This pull request was exported from Phabricator. Differential Revision: D23404286
💊 CI failures summary and remediations
As of commit 6e6b5af (more details on the Dr. CI page):
🕵️ 1 new failure recognized by patterns
The following CI failures do not appear to be due to upstream breakages:
mrshenli
left a comment
Hey @mingzhe09088, thanks for adding this. Could you please add some more description to the PR summary to explain the benefits of using a high priority stream?
Codecov Report
@@ Coverage Diff @@
## master #43796 +/- ##
=========================================
Coverage ? 69.25%
=========================================
Files ? 378
Lines ? 46862
Branches ? 0
=========================================
Hits ? 32452
Misses ? 14410
Partials ? 0
torch/csrc/distributed/c10d/init.cpp
Outdated
Don't we need some docs here for isHighPriority and opTimeout explaining what this means to users?
This will only be for power users. Not sure what's a good place to add the docs. Could you suggest a place for that?
Summary:
Pull Request resolved: #43796
This diff adds an option for the process group NCCL backend to pick high priority cuda streams.
Test Plan: waitforsandcastle
Reviewed By: jiayisuse
Differential Revision: D23404286
fbshipit-source-id: 412f8216678c74d932f8143040809108d03eda79
This pull request has been merged in 574f9af.
Summary:
Pull Request resolved: #43796
This diff adds an option for the process group NCCL backend to pick high priority cuda streams.
Test Plan: waitforsandcastle
Reviewed By: jiayisuse
Differential Revision: D23404286
fbshipit-source-id: b79ae097b7cd945a26e8ba1dd13ad3147ac790eb
Summary: This diff adds an option for the process group NCCL backend to pick high-priority CUDA streams. It lets the CUDA driver prioritize NCCL kernels when compute kernels are also waiting to run. High-priority CUDA streams are explained here: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html#group__CUDART__STREAM_1ge2be9e9858849bf62ba4a8b66d1c3540
Test Plan: to add
Differential Revision: D23404286
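
To illustrate the intended effect described in the summary, here is a minimal toy model in plain Python. It is not the CUDA driver's scheduler and does not touch PyTorch or NCCL; it only sketches how a pending kernel on a higher-priority stream can be picked ahead of compute kernels that were queued earlier. The kernel names, the `dispatch_order` helper, and the priority constants are hypothetical. The one thing borrowed from CUDA is the convention that a numerically lower value means a higher priority.

```python
import heapq
from itertools import count

# Toy model only: the real CUDA scheduler is not a simple priority queue.
# Convention borrowed from CUDA: a lower number means a higher priority.
HIGH_PRIORITY = -1     # hypothetical priority for the NCCL stream
DEFAULT_PRIORITY = 0   # hypothetical priority for ordinary compute streams


def dispatch_order(pending):
    """Return kernel names in the order a priority-aware picker would choose.

    `pending` is a list of (name, priority) pairs in submission order.
    Kernels with equal priority keep their submission (FIFO) order.
    """
    seq = count()  # tie-breaker that preserves submission order
    heap = [(priority, next(seq), name) for name, priority in pending]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


# Two compute kernels are already waiting when the all-reduce is submitted.
with_high_priority = dispatch_order([
    ("matmul_0", DEFAULT_PRIORITY),
    ("matmul_1", DEFAULT_PRIORITY),
    ("allreduce", HIGH_PRIORITY),
])

without_high_priority = dispatch_order([
    ("matmul_0", DEFAULT_PRIORITY),
    ("matmul_1", DEFAULT_PRIORITY),
    ("allreduce", DEFAULT_PRIORITY),
])

print(with_high_priority)     # all-reduce is picked first
print(without_high_priority)  # all-reduce waits behind both matmuls
```

In this model the communication kernel jumps the queue only when its stream has the higher priority; with equal priorities it waits behind the compute work, which is the situation the option is meant to avoid.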