
[DTensor] add explicit mode (ExplicitRedistributionContext)#166593

Closed
wconstab wants to merge 3 commits into gh/wconstab/448/base from gh/wconstab/448/head

Conversation

Contributor

@wconstab wconstab commented Oct 29, 2025

Stack from ghstack (oldest at bottom):

usage:

```
dx = distribute_tensor(x, device_mesh, [Shard(0)])
dA = distribute_tensor(A, device_mesh, [Shard(0)])
with ExplicitRedistributionContext():
    with self.assertRaisesRegex(RuntimeError, "Implicit redistribution"):
        # Shard(0) @ Shard(0) requires a redistribution
        torch.matmul(dx, dA)
```
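
For context, a minimal sketch (not part of this PR) of how a caller could satisfy the context by redistributing explicitly. It assumes the same `dx`, `dA`, and `device_mesh` as above, that explicit `.redistribute()` calls remain permitted inside the context, and uses the public `Replicate` placement:

```
# Sketch only: assumes explicit redistribute() is allowed inside the context.
from torch.distributed.tensor import Replicate

with ExplicitRedistributionContext():
    # Explicitly replicate dA so the matmul no longer needs an implicit redistribution.
    dA_repl = dA.redistribute(device_mesh, [Replicate()])
    # Shard(0) @ Replicate() has a zero-communication strategy, so this succeeds.
    out = torch.matmul(dx, dA_repl)
```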

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @d4l3k @pragupta @msaroufim @dcci


pytorch-bot bot commented Oct 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166593

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9eab1bf with merge base 397d9fe:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

wconstab added a commit that referenced this pull request Oct 29, 2025
ghstack-source-id: 5466441
Pull Request resolved: #166593
@pytorch-bot pytorch-bot bot added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Oct 29, 2025
Collaborator

kwen2501 commented Oct 30, 2025

This is at least better than using parallelize_module to parallelize an inner op. (Just by reading that statement, you can probably sense where the mismatch is.)

An op is hard to target from a user script because, unlike parameters, ops don't have FQNs.

So the user either has to take a graph-based approach, where ops become nodes they can get a handle to, but that requires writing graph passes to parallelize the program;

or they can bypass the nn.Module layer and interact with the ops eagerly, as you did here.

cc H-Huang awgu wanchaol fegin fduwjj wz337 d4l3k pragupta msaroufim dcci

[ghstack-poisoned]
wconstab added a commit that referenced this pull request Nov 5, 2025
ghstack-source-id: 48269f6
Pull Request resolved: #166593
cc H-Huang awgu wanchaol fegin fduwjj wz337 d4l3k pragupta msaroufim dcci

[ghstack-poisoned]
wconstab added a commit that referenced this pull request Nov 5, 2025
ghstack-source-id: fb6a16d
Pull Request resolved: #166593

@wconstab wconstab changed the title Prototype DTensor explicit mode [DTensor] add explicit mode (ExplicitRedistributionContext) Nov 6, 2025
@wconstab wconstab added the release notes: distributed (dtensor) release notes category label Nov 6, 2025
@wconstab wconstab requested review from ezyang and tianyu-l November 6, 2025 00:38
Contributor Author

wconstab commented Nov 6, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 6, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

pull bot pushed a commit to j3din00b/pytorch that referenced this pull request Nov 11, 2025
…torch#167370)

Also support nesting, enable/disable, and make the class use a
thread-local for storage so independent threads do not confuse each
other.

Pull Request resolved: pytorch#167370
Approved by: https://github.com/ezyang
ghstack dependencies: pytorch#166593
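
A rough sketch of how the nesting and enable/disable behavior described in that commit might look in practice. The `enabled` keyword below is a hypothetical spelling assumed for illustration and is not confirmed by this page; see pytorch#167370 for the actual interface:

```
# Hypothetical sketch: the `enabled` keyword is an assumption; only
# "nesting, enable/disable" is stated in the commit message above.
with ExplicitRedistributionContext():                    # strict scope: implicit redistribution raises
    with ExplicitRedistributionContext(enabled=False):   # nested scope temporarily relaxes the check
        torch.matmul(dx, dA)                             # implicit redistribution allowed again here
    torch.matmul(dx, dA)                                 # back in the outer scope: raises RuntimeError
```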
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
…166593)

usage:

```
dx = distribute_tensor(x, device_mesh, [Shard(0)])
dA = distribute_tensor(A, device_mesh, [Shard(0)])
with ExplicitRedistributionContext():
    with self.assertRaisesRegex(RuntimeError, "Implicit redistribution"):
        # Shard(0) @ Shard(0) requires a redistribution
        torch.matmul(dx, dA)
```

Pull Request resolved: pytorch#166593
Approved by: https://github.com/ezyang
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
…torch#167370)

Also support nesting, enable/disable, and make the class use a
thread-local for storage so independent threads do not confuse each
other.

Pull Request resolved: pytorch#167370
Approved by: https://github.com/ezyang
ghstack dependencies: pytorch#166593
@github-actions github-actions bot deleted the gh/wconstab/448/head branch December 7, 2025 02:21

Labels

ciflow/inductor, ciflow/trunk, Merged, oncall: distributed, release notes: distributed (dtensor)
