[PT-D][Tensor parallelism] Add documentation for TP #94421
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94421
Note: links to docs will display an error until the doc builds have completed.
✅ No Failures as of commit 8d99683.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
wz337 left a comment:

LGTM
wanchaol left a comment:

First pass. Let's add the experimental line, since this is a prototype release.
> We also enabled 2D parallelism to integrate with ``FullyShardedDataParallel``.
> Users just need to call the following API explicitly:
I remember we have an FSDP extension. Does TP automatically register the extension now?
Also, I wonder if we should include a small code snippet showing what 2-D parallelism looks like.
The registration happens in ``is_available``. Let me send a follow-up PR for this one.
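For reference, here is a minimal sketch of what such a 2-D (FSDP + TP) snippet might look like, assuming the prototype API under review (``DeviceMesh``, ``PairwiseParallel``, ``tp_mesh_dim``, ``is_available``, and ``get_dim_groups`` are the prototype names at the time of this PR and may change in follow-ups; the mesh shape and toy model are purely illustrative):

```python
# A minimal sketch, assuming the prototype API documented in this PR.
# Names (DeviceMesh, PairwiseParallel, tp_mesh_dim, is_available) are the
# prototype spellings at the time of this review and may be renamed later.
# Assumes launch via torchrun with 8 GPUs.
import torch
import torch.distributed as dist
import torch.nn as nn

from torch.distributed._tensor import DeviceMesh
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.tensor.parallel import PairwiseParallel, parallelize_module
from torch.distributed.tensor.parallel.fsdp import is_available

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# A 2 (data parallel) x 4 (tensor parallel) device mesh over 8 ranks.
mesh = DeviceMesh("cuda", torch.arange(8).view(2, 4))

model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16)).cuda()

# Shard the two linear layers pairwise (colwise then rowwise) across the
# tensor-parallel dimension of the mesh.
model = parallelize_module(model, mesh, PairwiseParallel(), tp_mesh_dim=1)

# Per this thread, is_available() also registers the 2-D FSDP hooks, so it
# must be called before wrapping with FSDP.
if is_available():
    # Wrap over the data-parallel dimension of the mesh.
    model = FSDP(model, process_group=mesh.get_dim_groups()[0])
```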
> .. currentmodule:: torch.distributed.tensor.parallel.fsdp
> .. autofunction:: is_available
Do we really need to add this API to the doc? I remember ``is_available`` was introduced back when we were in tau, but now that it is in PyTorch, shouldn't FSDP always be available?
Yes, because of the 2-D hook registration.
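To make that side effect explicit, a hedged illustration (the ``wrap_2d`` helper and its arguments are hypothetical, not code from this PR): the availability check doubles as the hook registration, so it has to run before the FSDP wrap.

```python
# Hedged illustration of the point above: per this thread, is_available()
# is not just a capability check; it also registers the 2-D FSDP hooks,
# so it has to run before a tensor-parallel module is wrapped with FSDP.
# The wrap_2d helper below is hypothetical, not code from this PR.
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.tensor.parallel.fsdp import is_available

def wrap_2d(tp_model, dp_process_group):
    if not is_available():  # side effect: registers the 2-D hooks
        raise RuntimeError("2-D parallelism integration is unavailable")
    return FSDP(tp_model, process_group=dp_process_group)
```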
Will send a follow-up PR to address the naming of this one.
@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here.
Successfully rebased.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):

This is far from complete, and we will definitely polish it down the road.