[FSDP] Use `reduce_scatter_tensor()` #87240

awgu · 2022-10-18T22:00:35Z

Stack from ghstack:

#87240 [FSDP] Use reduce_scatter_tensor()
#87308 [FSDP][2/N] Fix grad zero vs. None edge case
#87314 [FSDP][1/N] Update summon_full_params(with_grads) None gradient

Let us silence some more warnings 👍🏼

[ghstack-poisoned]

pytorch-bot · 2022-10-18T22:00:38Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/87240

✅ No Failures

As of commit fe79b20:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zhaojuanmao

somehow did not see it replaces the _reduce_scatter_base() call in fully_sharded_data_parallel.py?

awgu · 2022-10-21T11:19:03Z

> somehow did not see it replaces the _reduce_scatter_base() call in fully_sharded_data_parallel.py?

After Olga's communication hook work, the reduce-scatter is now the one in torch/distributed/algorithms/_comm_hooks/default_hooks.py.

@rohan-varma mentioned maybe the call is too non-obvious now. We can refactor later.
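
For readers less familiar with the collective under discussion, here is a minimal sketch of what a sum reduce-scatter computes: every rank contributes a flat buffer of the same length, the buffers are summed element-wise, and rank `r` keeps only the `r`-th contiguous chunk of the result. This is a plain-Python simulation under stated assumptions, not the code in `default_hooks.py`; `simulate_reduce_scatter` is a hypothetical name used only for illustration, and no `torch.distributed` call is made.

```python
from typing import List


def simulate_reduce_scatter(inputs: List[List[float]]) -> List[List[float]]:
    """Simulate a sum reduce-scatter across len(inputs) ranks.

    inputs[r] is the flat, full-size buffer contributed by rank r.
    The returned list holds, at index r, the shard that rank r would receive.
    (Hypothetical helper for illustration; not part of PyTorch.)
    """
    world_size = len(inputs)
    numel = len(inputs[0])
    if any(len(buf) != numel for buf in inputs):
        raise ValueError("every rank must contribute a buffer of the same length")
    if numel % world_size != 0:
        raise ValueError("buffer length must be divisible by the world size")
    shard = numel // world_size
    # "Reduce": element-wise sum of the buffers from all ranks.
    reduced = [sum(buf[i] for buf in inputs) for i in range(numel)]
    # "Scatter": rank r keeps only the r-th contiguous chunk.
    return [reduced[r * shard:(r + 1) * shard] for r in range(world_size)]


if __name__ == "__main__":
    # Two ranks, each holding a 4-element unsharded gradient.
    grads = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
    print(simulate_reduce_scatter(grads))  # [[11.0, 22.0], [33.0, 44.0]]
```

In the real code path the same result is produced by a single collective call on a pre-allocated output tensor; this PR only changes which function name issues that call.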

rohan-varma

LGTM. agree current situation is non-obvious, we should consider refactor.

awgu · 2022-10-24T03:28:17Z

@pytorchbot rebase -s

pytorchmergebot · 2022-10-24T03:30:09Z

@pytorchbot successfully started a rebase job. Check the current status here

pytorchmergebot · 2022-10-24T03:30:15Z

Rebase failed due to Command `git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict gh/awgu/133/orig` returned non-zero exit code 1

Rebasing (1/1)
Auto-merging test/distributed/fsdp/test_fsdp_comm.py
CONFLICT (content): Merge conflict in test/distributed/fsdp/test_fsdp_comm.py
error: could not apply e4eb7466c1... [FSDP] Use `reduce_scatter_tensor()`
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply e4eb7466c1... [FSDP] Use `reduce_scatter_tensor()`

Raised by https://github.com/pytorch/pytorch/actions/runs/3309788180

awgu · 2022-10-24T11:27:51Z

@pytorchbot merge

pytorchmergebot · 2022-10-24T11:29:18Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

github-actions · 2022-10-24T11:30:01Z

Hey @awgu.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

Let us silence some more warnings 👍🏼 Pull Request resolved: pytorch#87240 Approved by: https://github.com/rohan-varma

[FSDP] Use reduce_scatter_tensor()

512ced6

[ghstack-poisoned]

awgu requested review from H-Huang, kwen2501, mingzhe09088, mrshenli, pritamdamania87, rohan-varma and zhaojuanmao as code owners October 18, 2022 22:00

pytorch-bot bot added release notes: distributed (fsdp) release notes category labels Oct 18, 2022

awgu pushed a commit that referenced this pull request Oct 18, 2022

[FSDP] Use reduce_scatter_tensor()

e4eb746

ghstack-source-id: 1b31017 Pull Request resolved: #87240

zhaojuanmao reviewed Oct 21, 2022

awgu added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 21, 2022

rohan-varma approved these changes Oct 24, 2022

Update on "[FSDP] Use reduce_scatter_tensor()"

fe79b20

Let us silence some more warnings 👍🏼 [ghstack-poisoned]

awgu pushed a commit that referenced this pull request Oct 24, 2022

[FSDP] Use reduce_scatter_tensor()

ec8464a

ghstack-source-id: 6238433 Pull Request resolved: #87240

pytorchmergebot added the Merged label Oct 24, 2022

pytorchmergebot closed this in 04ad013 Oct 24, 2022

awgu added topic: developer feature and removed topic: developer feature labels Oct 24, 2022

awgu added the topic: improvements topic category label Oct 24, 2022

kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Nov 5, 2022

[FSDP] Use reduce_scatter_tensor() (pytorch#87240)

e27ce21

Let us silence some more warnings 👍🏼 Pull Request resolved: pytorch#87240 Approved by: https://github.com/rohan-varma

kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022

[FSDP] Use reduce_scatter_tensor() (pytorch#87240)

bd1adb4

Let us silence some more warnings 👍🏼 Pull Request resolved: pytorch#87240 Approved by: https://github.com/rohan-varma

facebook-github-bot deleted the gh/awgu/133/head branch June 8, 2023 15:23
