
Conversation

@zasdfgbnm
Collaborator

@zasdfgbnm zasdfgbnm commented Dec 7, 2021

This is the very first step for the UCC-NCCL integration. This PR lets ProcessGroupNCCL load torch_ucc.so if the user specifies the environment variable TORCH_UCC_LIBRARY_PATH. If this environment variable is not set, there is no visible change.

In the future, we may want to make PyTorch smart enough to automatically detect torch_ucc.so on the user's system, but before doing that, I believe we should first make sure that ProcessGroupUCC is very well tested.

Note that in this PR, ProcessGroupNCCL just loads the library but does not use it. I am trying to keep PRs small, so the usage of torch_ucc.so will be submitted in later PRs.

This PR requires the change in facebookresearch/torch_ucc#56; otherwise torch_ucc.so cannot be successfully loaded. But this PR can be landed separately without waiting for facebookresearch/torch_ucc#56 because UCC is never used or tested in PyTorch's unit tests.

cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang

@facebook-github-bot
Contributor

facebook-github-bot commented Dec 7, 2021

🔗 Helpful links

💊 CI failures summary and remediations

As of commit 0e710a7 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@facebook-github-bot facebook-github-bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Dec 7, 2021
@pytorch-probot

pytorch-probot bot commented Dec 7, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/214d33a509349431f4e80b4720c34377687fba18/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-full-jit ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-binary-conda ciflow/binaries, ciflow/binaries/conda 🚫 skipped
linux-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries/libtorch 🚫 skipped
linux-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries/libtorch 🚫 skipped
linux-binary-manywheel ciflow/binaries, ciflow/binaries/wheel 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-bionic-py3.6-clang9 ciflow/xla 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-libtorch-linux-xenial-cuda11.1-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.1-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.1-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

Contributor

@pritamdamania87 pritamdamania87 left a comment


But this PR can be landed separately without waiting for facebookresearch/torch_ucc#56 because, in PyTorch's unit tests, UCC is never used or tested.

I do feel if we add some functionality to PyTorch, there should be tests covering this functionality. What would it take to add some tests for the UCC functionality?

@zasdfgbnm
Collaborator Author

@pritamdamania87 I agree that when we add new functionality, there should be tests. Actually, the majority of my work is to incrementally enable most torch.distributed tests on UCC. My plan is:

  • This is the first PR, it just adds the code that loads torch_ucc.so and does nothing more.
  • In the second PR, I will do a bit more: this time, ProcessGroupNCCL will create an instance of ProcessGroupUCC and store its pointer. But this pointer is not used.
  • In the third PR, I will make ProcessGroupNCCL dispatch to ProcessGroupUCC on CPU for send, recv, and recvAnysource, enable unit tests for them on CPU, and modify the CI pipelines to install torch_ucc.so so that these tests are run.
  • After the third PR, more functionality and tests will be incrementally enabled, and every PR will be tested.

With this plan, the first two PRs will NOT be tested. Does this plan sound good to you? If you prefer, I can merge the first three PRs into one. The PR will be larger and a bit more difficult to review, but it ensures that everything added in it is tested.

@mrshenli mrshenli added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Dec 8, 2021

static std::once_flag initialize_ucc_lib_flag;
std::call_once(initialize_ucc_lib_flag, [&]{
ucc_lib_ = loadTorchUCC();
Contributor

Should we just load UCC when CUDA/GPU is not available?

Collaborator Author

UCC will be used to handle CPU tensors for ProcessGroupNCCL, so no, even if CUDA/GPU is available, UCC will still be useful.

Contributor

One question about the UCC dependency. Does OSS UCC depend on NCCL? If so, when UCC is loaded through torch_ucc, is it possible that NCCL will also be loaded? Will this end up with two different versions of NCCL being loaded into PyTorch?

Collaborator Author

Yes, UCC supports using NCCL (as well as UCX and other libraries) as its transport. But in our case, we are not using this functionality. When we use UCC, we will be using the UCX backend, and when we want to use NCCL, we will use it directly.

For the question of having multiple versions of NCCL, my understanding is:

If both PyTorch and UCC are dynamically linked to NCCL

The documentation of dlopen[1] says:

Symbol references in the shared object are resolved using (in order): symbols in the link map of objects loaded for the main program and its dependencies; symbols in shared objects (and their dependencies) that were previously opened with dlopen() using the RTLD_GLOBAL flag; and definitions in the shared object itself (and any dependencies that were loaded for that object).

Because PyTorch is always loaded earlier than torch_ucc, by the time torch_ucc is loaded, symbols of NCCL should have already been available in the main program. So dlopen will just use these symbols for UCC.

If both PyTorch and UCC are statically linked to NCCL

For static linking, symbol resolution should happen at link time. So PyTorch will have its copy of NCCL code, and UCC will also have its copy. Each will use its own copy. The NCCL symbols in libucc.so will not be added to the main program for later symbol resolution because we dlopen it with RTLD_LOCAL, and according to the documentation[1]:

This is the converse of RTLD_GLOBAL, and the default if neither flag is specified. Symbols defined in this shared object are not made available to resolve references in subsequently loaded shared objects.

If PyTorch is statically linked to NCCL and UCC is dynamically linked to NCCL

For this case, PyTorch will use the NCCL it is compiled with because symbol resolution happened at link time. I don't know which NCCL UCC will use, because I don't know whether NCCL's symbols will be copied into PyTorch and added to the link map. If yes, then UCC will use the one from PyTorch. If not, then UCC will load its own NCCL and use the symbols there.

If PyTorch is dynamically linked to NCCL and UCC is statically linked to NCCL

PyTorch is always loaded earlier than torch_ucc, so PyTorch will load NCCL from the system and use that NCCL. But UCC will still use its own NCCL because all the NCCL function calls are already resolved when libucc.so is generated.

Reference

[1] dlopen(3), Linux Programmer's Manual

@mingzhe09088
Contributor

At a high level, is there a plan to introduce a cmake flag to control whether UCC is on or off?

@zasdfgbnm
Collaborator Author

I have resolved all review comments, including the cmake flag.

@zasdfgbnm
Collaborator Author

@mingzhe09088 @jiayisuse @pritamdamania87 Does this PR and the RFC #70654 look good to you?

CMakeLists.txt Outdated
cmake_dependent_option(
USE_C10D_NCCL "USE C10D NCCL" ON "USE_DISTRIBUTED;USE_NCCL" OFF)
cmake_dependent_option(
USE_NCCL_WITH_UCC "Enable UCC support for ProcessGroupNCCL. Only available if USE_C10D_NCCL is on." ON
Contributor

Do we really need this cmake flag if we are anyways only conditionally loading the ucc lib at runtime? What is the downside if we don't have this flag at all and default to true? Would there be some issues in the case where ucc lib is not found?

Collaborator Author

I added this based on #69552 (comment). Right now it might seem a bit redundant, but in the future, when ProcessGroupNCCL starts to automatically search for and load torch_ucc.so from the user's system, having a cmake flag will be helpful. In my opinion, if we are sure we will need one in the future, it won't hurt to have it now, but I can remove this flag if you want.


#ifdef USE_NCCL_WITH_UCC
static std::once_flag initialize_ucc_lib_flag;
std::call_once(initialize_ucc_lib_flag, [&]{
Contributor

Why do we need to pass & into the lambda? uccLib_ is static and we shouldn't need & to access it in the lambda?

Collaborator Author

@zasdfgbnm zasdfgbnm Jan 10, 2022

This is for rank_, otherwise we will have

/home/gaoxiang/pytorch-ucc2/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp: In lambda function:
/home/gaoxiang/pytorch-ucc2/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:599:32: error: ‘this’ was not captured for this lambda function
  599 |       LOG(INFO) << "[Rank " << rank_  << "] torch_ucc.so loaded";
      |                                ^~~~~
/home/gaoxiang/pytorch-ucc2/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:599:32: error: invalid use of non-static data member ‘c10d::ProcessGroup::rank_’
In file included from /home/gaoxiang/pytorch-ucc2/torch/csrc/distributed/c10d/ProcessGroupNCCL.hpp:13,
                 from /home/gaoxiang/pytorch-ucc2/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1:
/home/gaoxiang/pytorch-ucc2/torch/csrc/distributed/c10d/ProcessGroup.hpp:431:13: note: declared here
  431 |   const int rank_;
      |             ^~~~~

uccLib_ = loadTorchUCC();
if (uccLib_ != nullptr) {
LOG(INFO) << "[Rank " << rank_ << "] torch_ucc.so loaded";
}
Contributor

nit: In an else clause, should we log that torch_ucc.so was not loaded?

Collaborator Author

This is based on #69552 (comment)

@zasdfgbnm
Collaborator Author

This PR should be ready for review. The only thing I am unsure about is whether we should keep or remove the compile-time flag. There are suggestions, such as #70654 (comment), that a compile flag is not needed. I don't have a strong preference either way; it's totally up to Meta to decide :)

@facebook-github-bot
Contributor

@jiayisuse has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@zasdfgbnm
Collaborator Author

zasdfgbnm commented Mar 14, 2022

Any update @jiayisuse ?


namespace c10d {

std::shared_ptr<at::DynamicLibrary> loadTorchUCC() {
Contributor

Let's make this interface inline? Otherwise it causes conflicting symbols at build time. Thanks!

Collaborator Author

done

Contributor

Thanks! Imported and doing another round of tests

@facebook-github-bot
Contributor

@jiayisuse has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Mar 25, 2022
…s set (#69552)

Summary:
This is the very first step for the UCC-NCCL integration. This PR lets `ProcessGroupNCCL` load `torch_ucc.so` if the user specifies the environment variable `TORCH_UCC_LIBRARY_PATH`. If this environment variable is not set, there is no visible change.

In the future, we may want to make PyTorch smart enough to automatically detect `torch_ucc.so` on the user's system, but before doing that, I believe we should first make sure that `ProcessGroupUCC` is very well tested.

Note that in this PR, `ProcessGroupNCCL` just loads the library but does not use it. I am trying to keep PRs small, so the usage of `torch_ucc.so` will be submitted in later PRs.

This PR requires the change in facebookresearch/torch_ucc#56; otherwise `torch_ucc.so` cannot be successfully loaded. But this PR can be landed separately without waiting for facebookresearch/torch_ucc#56 because UCC is never used or tested in PyTorch's unit tests.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: #69552

Reviewed By: mruberry

Differential Revision: D34675212

Pulled By: jiayisuse

fbshipit-source-id: a3d1fb98340dbe3a931af555423863efd381f1ae
@github-actions
Contributor

Hey @zasdfgbnm.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

@zasdfgbnm zasdfgbnm deleted the load-ucc.so branch March 25, 2022 19:26
facebook-github-bot pushed a commit that referenced this pull request Apr 21, 2022
Summary:
This PR is based on #69552, please review that PR first. This PR requires facebookresearch/torch_ucc#57, but this PR can be landed separately because in PyTorch's unit tests, UCC is never used or tested.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: #69564

Reviewed By: mingzhe09088

Differential Revision: D35457276

Pulled By: jiayisuse

fbshipit-source-id: 662c5a771c7cfc92ab42955c9abd24e56e5cafda
malfet pushed a commit that referenced this pull request Apr 22, 2022
Summary:
This PR is based on #69552, please review that PR first. This PR requires facebookresearch/torch_ucc#57, but this PR can be landed separately because in PyTorch's unit tests, UCC is never used or tested.

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: #69564

Reviewed By: mingzhe09088

Differential Revision: D35457276

Pulled By: jiayisuse

fbshipit-source-id: 662c5a771c7cfc92ab42955c9abd24e56e5cafda
(cherry picked from commit 985b5f8)