[optim] Improve adadelta foreach, group tensors to maximize fast path #92048
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/92048
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit 853066c. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 949af47 to 86b9b51
```sh
time python test/run_test.py --verbose -i distributed/_shard/test_replicated_tensor
# Other tests
time python test/run_test.py --verbose -i test_cuda_primary_ctx
time python test/run_test.py --verbose -i test_optim -- -k optimizers_with_varying_tensors
```
@pytorch/pytorch-dev-infra to make sure this is okay. The total time it would add to multigpu is about 13 seconds.
albanD left a comment:
Looks good; just the seed needs to be changed. The rest are nits.
albanD left a comment:
SGTM
This PR has been accepted with the accept2ship label. Attempting to merge now. @pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…#91896), following up to #90865 and #92048. Pull Request resolved: #91896. Approved by: https://github.com/albanD
Previously, the adadelta foreach implementation would send tensors to the slow path unless they were all the same dtype and on the same device. This PR groups the tensors by device and dtype so that foreach runs on each homogeneous batch, allowing more users to benefit from foreach performance.
To ensure the new implementation works, new tests verify that this behavior is not broken.
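To illustrate the grouping idea, here is a minimal sketch. The helper `group_by_device_and_dtype` is hypothetical and written only for this example; it is not the code added by this PR, which does the grouping inside the adadelta implementation. It buckets parallel tensor lists by `(device, dtype)` so each bucket is homogeneous and can take the fast `torch._foreach_*` path.

```python
from collections import defaultdict

import torch


def group_by_device_and_dtype(tensorlists):
    # Hypothetical helper (for illustration only): bucket parallel tensor
    # lists by the (device, dtype) of the first list, so that each bucket
    # contains only tensors eligible for a single torch._foreach_* call.
    grouped = defaultdict(lambda: [[] for _ in tensorlists])
    for i, t in enumerate(tensorlists[0]):
        key = (t.device, t.dtype)
        for j, tl in enumerate(tensorlists):
            grouped[key][j].append(tl[i])
    return grouped


# Mixed-dtype params/grads like these would previously have forced the
# entire update onto the slow per-tensor path.
params = [torch.randn(3), torch.randn(3, dtype=torch.float64)]
grads = [torch.randn(3), torch.randn(3, dtype=torch.float64)]

# Instead, run one foreach op per homogeneous group.
for (device, dtype), (p_group, g_group) in group_by_device_and_dtype(
    [params, grads]
).items():
    torch._foreach_add_(p_group, g_group, alpha=-0.1)
```

The design point is graceful degradation: the common case (all parameters on one device with one dtype) still gets a single batched foreach call, while mixed inputs fall back to one foreach call per group rather than one op per tensor.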