Add alltoall_ to CommTensor #90512
Conversation
This PR adds alltoall_ to CommTensor.
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90512
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 Failures as of commit b8818e9: the following jobs have failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
yifuwang left a comment:
Nice!
torch/csrc/distributed/c10d/init.cpp (outdated):

        const std::vector<at::Tensor>& output_tensors,
        const std::vector<at::Tensor>& input_tensors) {
Do we expect calling this through pg.alltoall(..) to be backward compatible?
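For concreteness, a minimal sketch of the direct call path in question; the backend, ranks, and tensor shapes here are illustrative assumptions, not taken from the PR:

    import torch
    import torch.distributed as dist

    def demo(rank: int, world_size: int) -> None:
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        pg = dist.distributed_c10d._get_default_group()
        # Each rank sends one tensor to every peer and receives one back.
        inputs = [torch.full((2,), float(rank)) for _ in range(world_size)]
        outputs = [torch.empty(2) for _ in range(world_size)]
        # Public wrapper; user code may also call pg.alltoall(outputs, inputs)
        # directly, which is the BC surface this question is about.
        dist.all_to_all(outputs, inputs)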
Good point. Yeah, that BC-breaking issue is weird; let me investigate a bit what happened there.
The BC error is on the return type for alltoall_. It looks like this was recently added to the BC check. This should be fine, as it is not a user-facing API, and not even user-facing for PG extension devs.
    2022-12-09T03:04:38.6291268Z processing existing schema: c10d::alltoall_(Tensor[] _0, Tensor[] _1, __torch__.torch.classes.c10d.ProcessGroup _2, int _3) -> __torch__.torch.classes.c10d.Work _0
    2022-12-09T03:04:38.6292120Z Can NOT find forward compatible schemas after changes for schema c10d::alltoall_(Tensor[] _0, Tensor[] _1, __torch__.torch.classes.c10d.ProcessGroup _2, int _3) -> __torch__.torch.classes.c10d.Work _0 from the following candidates:
    2022-12-09T03:04:38.6292185Z [
    2022-12-09T03:04:38.6292525Z c10d::alltoall_(Tensor[] _0, Tensor[] _1, __torch__.torch.classes.c10d.ProcessGroup _2, int _3) -> (Tensor[] _0, __torch__.torch.classes.c10d.Work _1)
    2022-12-09T03:04:38.6292634Z ]
Adding const here should be fine; we are not going to change the tensor instance.
Got it, thanks for the insights @mrshenli! Let me update the BC test to skip it :)
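For reference, the BC check's skip mechanism is an allow list of schema-name prefixes paired with expiry dates; a sketch of the kind of entry meant here (the file path and the date are assumptions, not from this PR):

    # In test/forward_backward_compatibility/check_forward_backward_compatibility.py
    # (path assumed): schemas matching the prefix are exempt from the BC check
    # until the given date.
    import datetime

    ALLOW_LIST = [
        ("c10d::alltoall_", datetime.date(2023, 3, 1)),  # date is illustrative
    ]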
torch/csrc/distributed/c10d/init.cpp (outdated):

        },
        py::arg("output"),
        py::arg("input"),
        py::arg("output_tensors"),
However, changing the names might have a BC issue (though your change is indeed the right thing to do) for programs that directly call into pg.alltoall(). Shall we do the name change in a separate PR and mark that one as BC-breaking?
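A sketch of the kind of caller that would break under the rename (pg, outputs, and inputs are placeholders):

    # Old keyword names: break once the arguments are renamed.
    work = pg.alltoall(output=outputs, input=inputs)
    # New keyword names: break on older builds that still use output/input.
    work = pg.alltoall(output_tensors=outputs, input_tensors=inputs)
    # Positional calls, pg.alltoall(outputs, inputs), are unaffected either way.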
Yeah, makes sense. Updated.
mrshenli left a comment:
Stamp to unblock. Please address comments :)
This pull request has been merged in 3ba9e4c.
Stack from ghstack (oldest at bottom):
This PR adds alltoall_ to CommTensor.
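For context, a rough usage sketch of the feature; the CommTensor import path and the idea that plain wrapping is all that tracing needs are assumptions, not confirmed by this PR:

    import torch
    import torch.distributed as dist
    # Import path assumed; CommTensor wraps a tensor so tracing (e.g. make_fx)
    # can record the inplace collective and insert the matching wait() before
    # the wrapped result is consumed.
    from torch.distributed._spmd.comm_tensor import CommTensor

    def traced_all_to_all(rank: int, world_size: int):
        inputs = [CommTensor(torch.full((2,), float(rank)))
                  for _ in range(world_size)]
        outputs = [CommTensor(torch.empty(2)) for _ in range(world_size)]
        dist.all_to_all(outputs, inputs)  # dispatches to c10d::alltoall_
        return outputs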