extend C++ DTensor fast path to local operator dispatch #166808

Closed
swolchok wants to merge 13 commits into gh/swolchok/865/base from gh/swolchok/865/head

pytorch-bot bot commented Nov 1, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166808

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b936c51 with merge base 780e325:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

… path to local operator dispatch"

cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k pragupta msaroufim dcci

[ghstack-poisoned]
swolchok added a commit that referenced this pull request Nov 3, 2025
ghstack-source-id: 0b5fc22
Pull Request resolved: #166808
swolchok added the "release notes: distributed (dtensor)" label on Nov 3, 2025
…ispatch"

swolchok added a commit that referenced this pull request Nov 4, 2025
ghstack-source-id: a80e3f6
Pull Request resolved: #166808
…dispatch"

swolchok added a commit that referenced this pull request Nov 6, 2025
ghstack-source-id: 1ede132
Pull Request resolved: #166808
… local operator dispatch"

swolchok added a commit that referenced this pull request Nov 7, 2025
ghstack-source-id: 8e0285c
Pull Request resolved: #166808
…path to local operator dispatch"

swolchok added a commit that referenced this pull request Nov 10, 2025
…DTensor fast path: port return_and_correct_aliasing and inplace/out checks"

This seems to generate a several-microsecond performance improvement in the detach benchmark I've been using.

swolchok added a commit that referenced this pull request Nov 10, 2025
…extend C++ fast path to local operator dispatch"

swolchok added a commit that referenced this pull request Nov 10, 2025
…h: port return_and_correct_aliasing and inplace/out checks"

…t arguments for local dispatch, and failure to return a list (was pushing multiple retvals onto stack) for list returning ops on "WIP: extend C++ fast path to local operator dispatch"

swolchok added a commit that referenced this pull request Nov 11, 2025
…ptionalTensorList arguments for local dispatch, and failure to return a list (was pushing multiple retvals onto stack) for list returning ops on "WIP: extend C++ fast path to local operator dispatch"

swolchok added a commit that referenced this pull request Nov 11, 2025
…ptionalTensorList arguments for local dispatch, and failure to return a list (was pushing multiple retvals onto stack) for list returning ops on "WIP: DTensor fast path: port return_and_correct_aliasing and inplace/out checks"

swolchok added a commit that referenced this pull request Nov 11, 2025
…t arguments for local dispatch, and failure to return a list (was pushing multiple retvals onto stack) for list returning ops on "WIP: DTensor fast path: port return_and_correct_aliasing and inplace/out checks"

swolchok changed the title from "WIP: extend C++ fast path to local operator dispatch" to "extend C++ DTensor fast path to local operator dispatch" on Nov 11, 2025
swolchok marked this pull request as ready for review on November 11, 2025 06:18
…o local operator dispatch"

swolchok requested review from XilunWu and wconstab on November 11, 2025 23:05
swolchok added the "ciflow/trunk" label (Trigger trunk jobs on your pull request) on Nov 11, 2025
… local operator dispatch"

ezyang requested a review from zpcore on November 12, 2025 03:51
…comments on "extend C++ DTensor fast path to local operator dispatch"

…ensor on "extend C++ DTensor fast path to local operator dispatch"

…ast path to local operator dispatch"

@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #167475

pytorchmergebot pushed a commit that referenced this pull request Nov 13, 2025
…hecks (#167475)

This seems to generate a several-microsecond performance improvement in the detach benchmark I've been using.

Pull Request resolved: #167475
Approved by: https://github.com/ezyang
ghstack dependencies: #167051, #166372, #166808
Khanaksahu pushed a commit to Khanaksahu/pytorch that referenced this pull request Nov 17, 2025
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
pytorchmergebot pushed a commit that referenced this pull request Nov 21, 2025
```
git revert --no-commit 567dcdb 200156e 3d801a4 2034ca9 480b4ff f570e58
```

    And Revert "[DTensor] Document fast-path dispatch (#168192)"
    And Revert "[DTensor] Fix deadlock after fast cache clear (#168069)"

Reverts:
* #167860
* #167588
* #167475
* #166808
* #166372
* #168192
* #168069

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: #168264
Approved by: https://github.com/seemethere, https://github.com/malfet
JacobSzwejbka pushed a commit that referenced this pull request Dec 8, 2025
github-actions bot deleted the gh/swolchok/865/head branch on December 14, 2025 02:21

Labels

* ciflow/inductor
* ciflow/trunk (Trigger trunk jobs on your pull request)
* Merged
* oncall: distributed (Add this issue/PR to distributed oncall triage queue)
* release notes: distributed (dtensor) (release notes category)
