@osalpekar osalpekar commented Jun 25, 2020

Stack from ghstack:

This PR aborts incomplete NCCL Communicators in the ProcessGroupNCCL
destructor. This should prevent pending NCCL communicators from blocking other CUDA ops.

Differential Revision: D22244873
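
For readers unfamiliar with the pattern, here is a minimal sketch of what aborting incomplete communicators in a destructor could look like. It is not the code from this PR: `MockComm`, `MockProcessGroup`, `hasPendingWork`, and `addComms` are hypothetical stand-ins, and `abort()` only sets a flag where a real implementation would presumably call NCCL's `ncclCommAbort`.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a wrapper around an NCCL communicator.
// A real wrapper would hold an ncclComm_t and call ncclCommAbort() in abort().
class MockComm {
 public:
  explicit MockComm(bool hasPendingWork) : hasPendingWork_(hasPendingWork) {}

  bool hasPendingWork() const { return hasPendingWork_; }
  bool isAborted() const { return aborted_; }

  // Assumption: aborting is idempotent and safe to call during teardown.
  void abort() { aborted_ = true; }

 private:
  bool hasPendingWork_;
  bool aborted_ = false;
};

// Hypothetical stand-in for a process group that caches communicators
// per device key (the key format is an assumption for illustration).
class MockProcessGroup {
 public:
  void addComms(const std::string& deviceKey,
                std::vector<std::shared_ptr<MockComm>> comms) {
    comms_[deviceKey] = std::move(comms);
  }

  // On destruction, abort every communicator that still has pending work
  // so it cannot keep blocking other operations after the group is gone.
  ~MockProcessGroup() {
    for (auto& entry : comms_) {
      for (auto& comm : entry.second) {
        if (comm->hasPendingWork()) {
          comm->abort();
        }
      }
    }
  }

 private:
  std::unordered_map<std::string, std::vector<std::shared_ptr<MockComm>>> comms_;
};
```

Destroying a `MockProcessGroup` (for example by letting it go out of scope) marks any registered communicator with pending work as aborted and leaves completed ones untouched.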

osalpekar added a commit that referenced this pull request Jun 25, 2020
ghstack-source-id: 106633077
Pull Request resolved: #40585
@osalpekar osalpekar requested a review from jiayisuse June 25, 2020 22:44
osalpekar added a commit that referenced this pull request Jul 1, 2020
Pull Request resolved: #40585
ghstack-source-id: 106988073

Differential Revision: [D22244873](https://our.internmc.facebook.com/intern/diff/D22244873/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D22244873/)!
dr-ci bot commented Jul 1, 2020

💊 CI failures summary and remediations

As of commit c937cab (more details on the Dr. CI page):


None of the CI failures appear to be your fault 💚



❄️ 1 failure tentatively classified as flaky, but reruns have not yet been triggered to confirm:

See CircleCI build caffe2_onnx_ort2_py3_6_clang7_ubuntu16_04_test (1/1)

Step: "Set Up CI Environment After attach_workspace" ❄️

gpg: no valid OpenPGP data found.
+ curl -s -L --retry 3 https://nvidia.github.io/nvidia-docker/gpgkey 
+ sudo apt-key add - 
gpg: no valid OpenPGP data found. 


@facebook-github-bot

This pull request has been merged in 49e12d8.

@facebook-github-bot facebook-github-bot deleted the gh/osalpekar/47/head branch July 5, 2020 14:17