[dtensor][4/N] refactor dispatching logic and add propagator #90733
Conversation
This PR refactors the dispatching logic to make it cleaner, and isolates the sharding propagation logic into a separate class, so that more complicated propagation features can be implemented later. [ghstack-poisoned]
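To make the shape of the refactor concrete, here is a minimal sketch of what "isolating sharding propagation into a separate class" could look like. The names below (ShardingPropagator, register_prop_rule, propagate) are illustrative assumptions for this sketch, not the exact identifiers introduced in the PR.

```python
# Hypothetical sketch: dispatch stays thin and delegates sharding decisions
# to a dedicated propagator object that owns the per-operator rules.
from typing import Callable, Dict


class ShardingPropagator:
    """Holds per-operator sharding propagation rules, isolated from dispatch."""

    def __init__(self) -> None:
        # Maps an operator name to a rule that derives output sharding
        # from input sharding.
        self.op_to_rules: Dict[str, Callable] = {}

    def register_prop_rule(self, op_name: str, rule: Callable) -> None:
        self.op_to_rules[op_name] = rule

    def propagate(self, op_name: str, input_sharding):
        rule = self.op_to_rules.get(op_name)
        if rule is None:
            raise NotImplementedError(f"no sharding rule for {op_name}")
        return rule(input_sharding)


propagator = ShardingPropagator()
# Elementwise ops can simply pass the input sharding through.
propagator.register_prop_rule("aten.add", lambda s: s)
print(propagator.propagate("aten.add", "Shard(0)"))  # prints Shard(0)
```

With the rules behind one object, the dispatcher only needs to call `propagate`, which is what makes more complicated propagation strategies easy to add later.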
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90733
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit 213b785. This comment was automatically generated by Dr. CI and updates every 15 minutes.
XilunWu
left a comment
Great work! Thanks for refactoring the op dispatching and sharding propagation facilities. Some typos to fix.
    if op_call in _CURRENT_DECOMPOSITION_TABLE:
        return _CURRENT_DECOMPOSITION_TABLE[op_call](*args, **kwargs)

    # STEP 0. See if threre're user defined custom aten operator
typo: threre're
    # implementations. Custom operators take the highest priority
    if custom_dispatch_ops is not None and str(op_call) in custom_dispatch_ops:
        # dispatch to user defined custom distributed tensor ops
        return custom_dispatch_ops[str(op_call)](*args, **kwargs)
question: will this custom_dispatch_ops be deprecated once register_impl is no longer needed? I assume that eventually we want to get rid of register_impl and fully adopt propagation rules.
Yes we should deprecate this once we move all ops to use propagation rules
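The discussion above concerns dispatch priority: decompositions first, then user-registered custom ops, then sharding propagation. A hedged, self-contained sketch of that ordering is below; the tables and the `"propagate"` fall-through are stand-ins for this illustration, not the PR's actual code.

```python
# Sketch of the dispatch priority discussed in the review thread.
_CURRENT_DECOMPOSITION_TABLE = {}  # op -> decomposed implementation
custom_dispatch_ops = {"aten.matmul": lambda *a, **k: "custom matmul"}


def dispatch(op_call, *args, **kwargs):
    # First: registered decompositions take effect before everything else.
    if op_call in _CURRENT_DECOMPOSITION_TABLE:
        return _CURRENT_DECOMPOSITION_TABLE[op_call](*args, **kwargs)
    # STEP 0: user-defined custom operator implementations win next.
    if custom_dispatch_ops is not None and str(op_call) in custom_dispatch_ops:
        return custom_dispatch_ops[str(op_call)](*args, **kwargs)
    # Otherwise fall through to sharding propagation (elided here).
    return "propagate"


print(dispatch("aten.matmul"))  # prints custom matmul
print(dispatch("aten.add"))     # prints propagate
```

Once all ops have propagation rules, the STEP 0 branch (and `register_impl`) can be dropped entirely, which is the deprecation path agreed on above.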
torch/distributed/_tensor/prop.py
Outdated
    if sharding_prop_func is None:
        # step 1. If there's not even one sharding rule
        # implemented for the operator, we fall back to
        # local tensor compute, this is wront currently
typo: wront
torch/distributed/_tensor/prop.py
Outdated
        # implemented for the operator, we fall back to
        # local tensor compute, this is wront currently
        # we will change the behavior to reshard to full
        # replicate and do the computatation
typo: computatation
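The comments above describe the fallback path: when no sharding rule exists for an operator, the current code falls back to local tensor compute, and the intended fix is to first reshard inputs to full replication. A minimal sketch of that intended behavior follows; `propagate_op_sharding` and `reshard_to_replicate` are hypothetical names for this illustration.

```python
# Sketch of the intended fallback: with no sharding rule registered,
# reshard every input to Replicate so local compute on each rank sees
# the full tensor, then run the op locally.

def reshard_to_replicate(t):
    # Placeholder for an allgather-based reshard to full replication.
    return ("Replicate", t)


def propagate_op_sharding(op_name, sharding_prop_funcs, inputs):
    sharding_prop_func = sharding_prop_funcs.get(op_name)
    if sharding_prop_func is None:
        # Fallback branch described in the review comments.
        inputs = [reshard_to_replicate(t) for t in inputs]
        return "local_compute", inputs
    return "rule", sharding_prop_func(inputs)


mode, out = propagate_op_sharding("aten.unknown", {}, ["shard0"])
print(mode)  # prints local_compute
```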
@wanchaol has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
This PR refactors the dispatching logic to make it cleaner, and isolates the sharding propagation logic into a separate class, so that more complicated propagation features can be implemented later. Differential Revision: [D42876251](https://our.internmc.facebook.com/intern/diff/D42876251) [ghstack-poisoned]
fduwjj
left a comment
LGTM
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged)
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
This PR refactors the dispatching logic to make it cleaner, and
isolates the sharding propagation logic into a separate class,
so that more complicated propagation features can be implemented
later.
Differential Revision: D42876251