
Introduce missing collectives and small fixes to support local tensor mode in AutoParallel #168110

Closed
dzmitry-huba wants to merge 2 commits into gh/dzmitry-huba/13/base from gh/dzmitry-huba/13/head

Conversation

@dzmitry-huba
Contributor

@dzmitry-huba dzmitry-huba commented Nov 18, 2025

This PR introduces support for additional functional collectives used in AutoParallel.

Another change is to the semantics of tolist() on LocalTensor. Previously, LocalTensor would reconcile first and then return a single result that is the same on all ranks. AutoParallel uses tolist() to compute all-to-all splits during token dispatch and combine, which requires per-rank values rather than a single reconciled one.
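For illustration, here is a minimal sketch of the token-dispatch pattern referred to above, assuming the functional all_to_all_single from torch.distributed._functional_collectives; the helper and tensor names are illustrative, not the PR's actual code. The point is that the splits tensor differs per rank, so a tolist() that reconciles first would hand every rank the same (wrong) split sizes:

```python
# Sketch only (hypothetical helper): why tolist() must yield per-rank
# values when computing all-to-all splits under LocalTensorMode.
import torch
import torch.distributed._functional_collectives as funcol

def dispatch_tokens(tokens: torch.Tensor, splits: torch.Tensor, group) -> torch.Tensor:
    # `splits[i]` is how many tokens this rank sends to rank i; it differs
    # across ranks, so reconciling before tolist() would be incorrect.
    input_splits = splits.tolist()
    # Exchange split sizes so each rank learns how much it will receive.
    recv_splits = funcol.all_to_all_single(splits, None, None, group).tolist()
    # Dispatch the tokens using the per-rank split sizes.
    return funcol.all_to_all_single(tokens, recv_splits, input_splits, group)
```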

Stack from ghstack (oldest at bottom):

@pytorch-bot pytorch-bot bot added the release notes: distributed (c10d) release notes category label Nov 18, 2025
@pytorch-bot

pytorch-bot bot commented Nov 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/168110

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit c27928b with merge base fb6af11:

BROKEN TRUNK - The following job failed but was already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

dzmitry-huba added a commit that referenced this pull request Nov 18, 2025
… mode in AutoParallel

ghstack-source-id: 5fd8ff5
Pull Request resolved: #168110
@dzmitry-huba dzmitry-huba marked this pull request as ready for review November 18, 2025 23:15


_LOCAL_TENSOR_MODE: list["LocalTensorMode"] = []
_GLOBAL_LOCAL_TENSOR_MODE: list["LocalTensorMode"] = []
Contributor

err, so what's the global mode lol
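For context, a module-level mode stack like this is usually managed with a push/pop context manager around the region where the mode is active. The sketch below is purely illustrative of that general pattern and is not the PR's code; it does not answer what distinguishes the two stacks:

```python
# Illustrative sketch of the usual push/pop pattern for a module-level
# mode stack; not the PR's implementation.
from contextlib import contextmanager

_LOCAL_TENSOR_MODE: list = []  # stack of currently active modes

@contextmanager
def local_tensor_mode(mode):
    _LOCAL_TENSOR_MODE.append(mode)  # enter: this mode becomes innermost
    try:
        yield mode
    finally:
        _LOCAL_TENSOR_MODE.pop()     # exit: restore the previous mode
```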

…ocal tensor mode in AutoParallel"

This PR introduces support for additional functional collectives used in AutoParallel.

Another change is to the semantics of tolist() on LocalTensor. Previously, LocalTensor would reconcile first and then return a single result that is the same on all ranks. AutoParallel uses tolist() to compute all-to-all splits during token dispatch and combine, which requires per-rank values rather than a single reconciled one.

[ghstack-poisoned]
dzmitry-huba added a commit that referenced this pull request Nov 19, 2025
… mode in AutoParallel

ghstack-source-id: 4b51e84
Pull Request resolved: #168110
@dzmitry-huba
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 19, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@dzmitry-huba
Contributor Author

@pytorchbot help

@pytorch-bot

pytorch-bot bot commented Nov 19, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'help' (choose from 'merge', 'revert', 'rebase', 'label', 'drci', 'cherry-pick')

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick} ...

Try @pytorchbot --help for more info.

@dzmitry-huba
Contributor Author

@pytorchbot --help

@pytorch-bot

pytorch-bot bot commented Nov 19, 2025

PyTorchBot Help

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick} ...

In order to invoke the bot on your PR, include a line that starts with
@pytorchbot anywhere in a comment. That line will form the command; no
multi-line commands are allowed. Some commands may be used on issues as specified below.

Example:
    Some extra context, blah blah, wow this PR looks awesome

    @pytorchbot merge

optional arguments:
  -h, --help            Show this help message and exit.

command:
  {merge,revert,rebase,label,drci,cherry-pick}
    merge               Merge a PR
    revert              Revert a PR
    rebase              Rebase a PR
    label               Add label to a PR
    drci                Update Dr. CI
    cherry-pick         Cherry pick a PR onto a release branch

Merge

usage: @pytorchbot merge [-f MESSAGE | -i] [-ic] [-r [{viable/strict,main}]]

Merge an accepted PR, subject to the rules in .github/merge_rules.json.
By default, this will wait for all required checks (lint, pull) to succeed before merging.

optional arguments:
  -f MESSAGE, --force MESSAGE
                        Merge without checking anything. This requires a reason for auditing purposes, for example:
                        @pytorchbot merge -f 'Minor update to fix lint. Expecting all PR tests to pass'
                        
                        Please use `-f` as a last resort; prefer `--ignore-current` to continue the merge while ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.
  -i, --ignore-current  Merge while ignoring the currently failing jobs.  Behaves like -f if there are no pending jobs.
  -ic                   Old flag for --ignore-current. Deprecated in favor of -i.
  -r [{viable/strict,main}], --rebase [{viable/strict,main}]
                        Rebase the PR to re-run checks before merging. Accepts viable/strict or main as branch options, defaulting to viable/strict if not specified.

Revert

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst,autorevert}

Revert a merged PR. This requires that you are a Meta employee.

Example:
  @pytorchbot revert -m="This is breaking tests on trunk. hud.pytorch.org/" -c=nosignal

optional arguments:
  -m MESSAGE, --message MESSAGE
                        The reason you are reverting, will be put in the commit message. Must be longer than 3 words.
  -c {nosignal,ignoredsignal,landrace,weird,ghfirst,autorevert}, --classification {nosignal,ignoredsignal,landrace,weird,ghfirst,autorevert}
                        A machine-friendly classification of the revert reason.

Rebase

usage: @pytorchbot rebase [-s | -b BRANCH]

Rebase a PR. Rebasing defaults to the stable viable/strict branch of pytorch.
Repeat contributors may use this command to rebase their PRs.

optional arguments:
  -s, --stable          [DEPRECATED] Rebase onto viable/strict
  -b BRANCH, --branch BRANCH
                        Branch you would like to rebase to

Label

usage: @pytorchbot label labels [labels ...]

Adds labels to a PR or Issue [Can be used on Issues]

positional arguments:
  labels  Labels to add to given Pull Request or Issue [Can be used on Issues]

Dr CI

usage: @pytorchbot drci 

Update Dr. CI. Updates the Dr. CI comment on the PR in case it's gotten out of sync with actual CI results.

Cherry-pick

usage: @pytorchbot cherry-pick --onto ONTO [--fixes FIXES] -c
                               {regression,critical,fixnewfeature,docs,release}

Cherry pick a pull request onto a release branch for inclusion in a release

optional arguments:
  --onto ONTO, --into ONTO
                        Branch you would like to cherry pick onto (Example: release/2.1)
  --fixes FIXES         Link to the issue that your PR fixes (Example: https://github.com/pytorch/pytorch/issues/110666)
  -c {regression,critical,fixnewfeature,docs,release}, --classification {regression,critical,fixnewfeature,docs,release}
                        A machine-friendly classification of the cherry-pick reason.
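
Example (illustrative; uses only the flags documented above):

  @pytorchbot cherry-pick --onto release/2.1 -c critical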

@pytorchmergebot
Collaborator

This PR (#168110) was merged in 6c02dde but it is still open, likely due to a GitHub bug, so mergebot is closing it manually. If you think this is a mistake, please feel free to reopen and contact Dev Infra.

XueningXu pushed a commit to XueningXu/pytorch that referenced this pull request Nov 19, 2025
… mode in AutoParallel (pytorch#168110)

This PR introduces support for additional functional collectives used in AutoParallel.

Another change is to the semantics of tolist() on LocalTensor. Previously, LocalTensor would reconcile first and then return a single result that is the same on all ranks. AutoParallel uses tolist() to compute all-to-all splits during token dispatch and combine, which requires per-rank values rather than a single reconciled one.

Pull Request resolved: pytorch#168110
Approved by: https://github.com/ezyang
@github-actions github-actions bot deleted the gh/dzmitry-huba/13/head branch December 20, 2025 02:16

Labels

ciflow/trunk (Trigger trunk jobs on your pull request) · Merged · release notes: distributed (c10d) (release notes category)

3 participants