Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/164517
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV: There is 1 currently active SEV. If your PR is affected, please view it below.
✅ No Failures as of commit 2f475da with merge base a707042.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Do we drop support for CUDA versions before 12.3? Otherwise, something like the snippet cited here would need to be added after the guarded block.
@pytorchbot revert -m="Diff reverted internally" -c="ghfirst"
This Pull Request has been reverted by a revert inside Meta. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit d1cbb74. Reverted #164517 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](#164517 (comment)))
@kwen2501 your PR has been successfully reverted.
@shunting314 The code you cited is merely host-side code, without fancy CUDA APIs. Which part is not buildable? Can you share your error log?
I don't have the log available. But it says `multimem_one_shot_reduce_out` is not defined, since the definition is in a conditional block guarded by the CUDA 12.3 check.
I see, thank you @shunting314.
Added the temporary workaround as suggested.
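For context, a minimal sketch of the kind of CUDA-version guard and fallback discussed above; the header choices, macro check, and function signature are assumptions for illustration, not the actual PyTorch source:

```cpp
// Illustrative sketch only: the signature and guard placement are assumed,
// not copied from the PyTorch sources.
#include <ATen/ATen.h>
#include <c10/util/Exception.h>
#include <cuda_runtime_api.h>  // provides CUDART_VERSION

#if defined(CUDART_VERSION) && (CUDART_VERSION >= 12030)
// Feature path: multimem instructions require CUDA >= 12.3.
void multimem_one_shot_reduce_out(at::Tensor& out, int64_t root) {
  // ... launch the multimem reduction kernel (elided) ...
  (void)out; (void)root;
}
#else
// Fallback stub so that host-side callers which reference the symbol
// still compile and link when building against CUDA < 12.3.
void multimem_one_shot_reduce_out(at::Tensor& /*out*/, int64_t /*root*/) {
  TORCH_CHECK(false, "multimem_one_shot_reduce_out requires CUDA >= 12.3");
}
#endif
```

Whether the guard wraps the caller or a stub like this is added below the guarded definition is exactly the choice being discussed above.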
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Modified `multimem_one_shot_all_reduce_out` function to accept a `root` argument, making it a `multimem_reduce` op. The original `multimem_one_shot_all_reduce` op becomes a caller of the `multimem_reduce`, with each rank providing its own rank id as root. Pull Request resolved: pytorch#164517 Approved by: https://github.com/ngimel
Stack from ghstack (oldest at bottom):

Modified `multimem_one_shot_all_reduce_out` function to accept a `root` argument, making it a `multimem_reduce` op. The original `multimem_one_shot_all_reduce` op becomes a caller of the `multimem_reduce`, with each rank providing its own rank id as root.

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci
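A rough sketch of the resulting call structure described above; the signatures below are simplified assumptions, not the actual symmetric-memory API:

```cpp
// Illustrative sketch; names and signatures are simplified assumptions,
// not the actual PyTorch symmetric-memory API.
#include <ATen/ATen.h>

// multimem_reduce: every rank contributes `in`, and the reduced result is
// produced for the rank identified by `root`.
void multimem_reduce_out(at::Tensor& out, const at::Tensor& in, int64_t root) {
  // ... device-side multimem reduction targeting `root` (elided) ...
  (void)out; (void)in; (void)root;
}

// The one-shot all-reduce becomes a thin caller of multimem_reduce: each
// rank passes its own rank id as `root`, so every rank ends up holding the
// reduced result.
void multimem_one_shot_all_reduce_out(at::Tensor& out, const at::Tensor& in,
                                      int64_t my_rank) {
  multimem_reduce_out(out, in, /*root=*/my_rank);
}
```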