Add python bindings for NCCL CTA policies #164309
lakshayg wants to merge 2 commits into pytorch:main
NCCLConfig can now be constructed with non-default CTA policies:

```python
import torch
from torch.distributed import ProcessGroupNCCL as nccl

config = nccl.NCCLConfig()
config.cta_policy = nccl.NCCL_CTA_POLICY_ZERO  # NCCL version >= 2.28
```
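Because `NCCL_CTA_POLICY_ZERO` is only bound when PyTorch is built against NCCL >= 2.28, caller code may want to fall back to another policy on older builds. The following is a minimal sketch of that idea, not part of this PR: `pick_cta_policy` is a hypothetical helper, and the `SimpleNamespace` objects with their integer values are stand-ins for the real `ProcessGroupNCCL` binding so the example runs without a GPU build.

```python
from types import SimpleNamespace


def pick_cta_policy(nccl, preferred="NCCL_CTA_POLICY_ZERO",
                    fallback="NCCL_CTA_POLICY_EFFICIENCY"):
    """Hypothetical helper: return the preferred policy constant if the
    binding exposes it, otherwise the fallback constant."""
    if hasattr(nccl, preferred):
        return getattr(nccl, preferred)
    return getattr(nccl, fallback)


# Stand-ins for the binding (placeholder values, for illustration only).
old_build = SimpleNamespace(NCCL_CTA_POLICY_EFFICIENCY=1)
new_build = SimpleNamespace(NCCL_CTA_POLICY_EFFICIENCY=1,
                            NCCL_CTA_POLICY_ZERO=2)

policy_on_old = pick_cta_policy(old_build)  # falls back to EFFICIENCY
policy_on_new = pick_cta_policy(new_build)  # uses ZERO

# With a real build, the same helper would be used as:
#   from torch.distributed import ProcessGroupNCCL as nccl
#   config = nccl.NCCLConfig()
#   config.cta_policy = pick_cta_policy(nccl)
```

Probing the attribute rather than hard-coding a version keeps the caller in step with whatever the installed build actually exposes.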
Whoops, pulled the trigger too quickly.
    processGroupNCCL.def_property_readonly_static(
        "NCCL_CTA_POLICY_EFFICIENCY",
        [](const py::object&) { return NCCL_CTA_POLICY_EFFICIENCY; });
    #ifdef NCCL_CTA_POLICY_ZERO // requires NCCL version >= 2.28
Where is this defined? I don't see NCCL_CTA_POLICY_ZERO defined anywhere in PyTorch or the NCCL GitHub repository. Might as well set it conditionally based on the NCCL version.
@Skylion007 Can this comment be resolved, or would you prefer I change it to use the NCCL version?
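For reference, the version-based alternative raised in this thread could look roughly like the following on the Python side. This is a sketch under assumptions: `supports_cta_policy_zero` is a hypothetical name, and the version tuple is passed in explicitly so the example runs without CUDA. On a real install the tuple could come from `torch.cuda.nccl.version()`.

```python
def supports_cta_policy_zero(nccl_version):
    """Hypothetical check: True if the (major, minor, ...) NCCL version
    tuple is at least 2.28, the version the diff comment names."""
    # Compare only major and minor so extra fields (patch, suffix) are ignored.
    return tuple(nccl_version[:2]) >= (2, 28)


# Example version tuples (illustrative values, not measured from any system).
new_enough = supports_cta_policy_zero((2, 28, 3))
too_old = supports_cta_policy_zero((2, 27, 5))

# With a real build, something like:
#   import torch
#   if supports_cta_policy_zero(torch.cuda.nccl.version()):
#       config.cta_policy = nccl.NCCL_CTA_POLICY_ZERO
```

A version check and the `#ifdef` on the macro answer slightly different questions: the former reports the NCCL version PyTorch sees, while the latter reflects what the headers defined at build time.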
@pytorchmergebot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
NCCLConfig can now be constructed with non-default [cta policies][1]

```python
import torch
from torch.distributed import ProcessGroupNCCL as nccl

config = nccl.NCCLConfig()
config.cta_policy = nccl.NCCL_CTA_POLICY_ZERO  # NCCL version >= 2.28
```

[1]: https://docs.nvidia.com/deeplearning/nccl/archives/nccl_2283/user-guide/docs/api/flags.html#nccl-communicator-cta-policy-flags

Pull Request resolved: pytorch#164309
Approved by: https://github.com/eqy
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci