[PT-D][Tensor parallelism] Add documentations for TP #94421

Closed

fduwjj wants to merge 4 commits into gh/fduwjj/71/base from gh/fduwjj/71/head

Contributor

fduwjj commented Feb 8, 2023 (edited)

Stack from ghstack (oldest at bottom):

-> [PT-D][Tensor parallelism] Add documentations for TP #94421

This is far from complete, and we will definitely polish it down the road.


          [PT-D][Tensor parallelism] Add documentations for TP

7c58fdd

[ghstack-poisoned]

pytorch-bot bot commented Feb 8, 2023 (edited)

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94421

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8d99683:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

fduwjj marked this pull request as draft

February 8, 2023 18:24

fduwjj added the release notes: distributed (dtensor) label

fduwjj marked this pull request as ready for review

February 8, 2023 18:26

fduwjj added the ciflow/trunk label

fduwjj requested a review from wanchaol

February 8, 2023 18:27

fduwjj marked this pull request as draft

February 8, 2023 18:29


          Update on "[PT-D][Tensor parallelism] Add documentations for TP"

ef98f57

[ghstack-poisoned]

fduwjj marked this pull request as ready for review

February 8, 2023 20:17

fduwjj requested review from XilunWu, kumpera and wz337

February 8, 2023 20:17

wz337 approved these changes

Contributor

wz337 left a comment

LGTM

docs/source/distributed.tensor.parallel.rst (outdated, resolved)

wanchaol reviewed

Collaborator

wanchaol left a comment

First pass: let's add the experimental line, as this is a prototype release.

docs/source/distributed.tensor.parallel.rst (outdated, resolved)

docs/source/distributed.tensor.parallel.rst (resolved)

docs/source/distributed.tensor.parallel.rst

    
    :members:

    We also enabled 2D parallelism to integrate with ``FullyShardedDataParallel``.
    Users just need to call the following API explicitly:

Collaborator

wanchaol Feb 8, 2023

I remember we have an FSDP extension. Does TP automatically register the extension now?

Also, I wonder if we should give a small code snippet showing what 2-D parallelism looks like.

Contributor Author

fduwjj Feb 8, 2023 (edited)

The registration happens in is_available. Let me send a follow-up PR for this one.

docs/source/distributed.tensor.parallel.rst

    
    .. currentmodule:: torch.distributed.tensor.parallel.fsdp

    .. autofunction:: is_available

Collaborator

wanchaol Feb 8, 2023

Do we really need to add this API to the docs? I remember is_available was introduced when we were in tau, but now that it's in PyTorch, I think FSDP should always be available?

Contributor Author

fduwjj Feb 8, 2023

Yes, because of 2D hook registration.

Contributor Author

fduwjj Feb 8, 2023

Will send a follow-up PR to address the naming of this one.


          Update on "[PT-D][Tensor parallelism] Add documentations for TP"

596c896


[ghstack-poisoned]

fduwjj requested review from H-Huang, awgu, kwen2501, mrshenli, rohan-varma and zhaojuanmao as code owners

February 8, 2023 22:37

Contributor Author

fduwjj commented Feb 8, 2023

@pytorchbot rebase

Collaborator

pytorchmergebot commented Feb 8, 2023

@pytorchbot successfully started a rebase job. Check the current status here


          Update on "[PT-D][Tensor parallelism] Add documentations for TP"

8d99683


[ghstack-poisoned]

Collaborator

pytorchmergebot commented Feb 8, 2023

Successfully rebased gh/fduwjj/71/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/94421)

pytorchmergebot pushed a commit that referenced this pull request


          [PT-D][Tensor parallelism] Add documentations for TP

eacbb63

ghstack-source-id: d03f0b1
Pull Request resolved: #94421

Contributor Author

fduwjj commented Feb 8, 2023

@pytorchbot merge

Collaborator

pytorchmergebot commented Feb 8, 2023

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorchmergebot added the Merged label

pytorchmergebot closed this in 41e3189

facebook-github-bot deleted the gh/fduwjj/71/head branch

June 8, 2023 17:12

Reviewers

wanchaol: left review comments

wz337: approved these changes

kumpera: awaiting requested review

XilunWu: awaiting requested review

mrshenli: awaiting requested review

zhaojuanmao: awaiting requested review

rohan-varma: awaiting requested review

H-Huang: awaiting requested review

awgu: awaiting requested review

kwen2501: awaiting requested review

Labels

ciflow/trunk, Merged, release notes: distributed (dtensor)