Skip to content

Conversation

@fduwjj
Copy link
Contributor

@fduwjj fduwjj commented Feb 8, 2023

Stack from ghstack (oldest at bottom):

This is far from completed and we will definitely polish it down the road.

@pytorch-bot
Copy link

pytorch-bot bot commented Feb 8, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94421

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8d99683:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@fduwjj fduwjj marked this pull request as draft February 8, 2023 18:24
@fduwjj fduwjj added the release notes: distributed (dtensor) release notes category label Feb 8, 2023
@fduwjj fduwjj marked this pull request as ready for review February 8, 2023 18:26
@fduwjj fduwjj added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 8, 2023
@fduwjj fduwjj requested a review from wanchaol February 8, 2023 18:27
@fduwjj fduwjj marked this pull request as draft February 8, 2023 18:29
@fduwjj fduwjj marked this pull request as ready for review February 8, 2023 20:17
@fduwjj fduwjj requested review from XilunWu, kumpera and wz337 February 8, 2023 20:17
Copy link
Contributor

@wz337 wz337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@wanchaol wanchaol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first pass, let's add the experimental line as this is prototype release.

:members:

We also enabled 2D parallelism to integrate with ``FullyShardedDataParallel``.
Users just need to call the following API explicitly:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remembered we have a FSDP extension, Is TP automatically register the extension now?

Also, I wonder if we should give a small code snippet showing how the 2-D parallel look like

Copy link
Contributor Author

@fduwjj fduwjj Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registrations is in the is_available. Let me send a follow-up PR for this one.



.. currentmodule:: torch.distributed.tensor.parallel.fsdp
.. autofunction:: is_available
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need to add this API to the doc? I remembered is_available is introduced when we are in tau, but since now it's pytorch I think fsdp should always be available?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because of 2D hook registration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will send a follow-up PR to address the naming of this one.


This is far from completed and we will definitely polish it down the road.


[ghstack-poisoned]
@fduwjj
Copy link
Contributor Author

fduwjj commented Feb 8, 2023

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a rebase job. Check the current status here


This is far from completed and we will definitely polish it down the road.


[ghstack-poisoned]
@pytorchmergebot
Copy link
Collaborator

Successfully rebased gh/fduwjj/71/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/94421)

pytorchmergebot pushed a commit that referenced this pull request Feb 8, 2023
ghstack-source-id: d03f0b1
Pull Request resolved: #94421
@fduwjj
Copy link
Contributor Author

fduwjj commented Feb 8, 2023

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@facebook-github-bot facebook-github-bot deleted the gh/fduwjj/71/head branch June 8, 2023 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: distributed (dtensor) release notes category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants