Include lr scaling info in adamw weight_decay docstring #154113

thomas-woehrle · 2025-05-22T11:44:04Z

Adds documentation as discussed in #38853

pytorch-bot · 2025-05-22T11:44:09Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154113

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 21a5219 with merge base d7a83ab:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

janeyx99

I have two concerns, but maybe it's okay.

1. While I like having some mention of this over none at all, I'm not sure the doc reader will necessarily know what this means from just the phrase. It may be better to include a bigger note:: and include the mathematical difference like mentioned in the issue.
2. We should mention this in all the *AdamWs, NAdam + RAdam as they have decoupled_weight_decay options.
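
For readers of this thread, the mathematical difference referred to above can be sketched in a few lines. This is an illustrative sketch, not code from the PR: it assumes the decay step applied by `torch.optim.AdamW` multiplies each parameter by `1 - lr * weight_decay`, and contrasts that with a hypothetical variant whose decay does not depend on `lr`. Both function names are made up for illustration, and the gradient-based part of the update is left out on purpose.

```python
def lr_scaled_decay(param: float, lr: float, weight_decay: float) -> float:
    """Decay step whose shrink factor is scaled by lr (assumed AdamW behavior)."""
    return param * (1.0 - lr * weight_decay)


def lr_independent_decay(param: float, weight_decay: float) -> float:
    """Hypothetical variant: the shrink factor does not depend on lr."""
    return param * (1.0 - weight_decay)


param, weight_decay = 1.0, 0.01
for lr in (1e-3, 1e-4):
    scaled = lr_scaled_decay(param, lr, weight_decay)
    independent = lr_independent_decay(param, weight_decay)
    # The lr-scaled result moves with lr; the lr-independent one does not.
    print(f"lr={lr:g}: lr-scaled -> {scaled:.8f}, lr-independent -> {independent:.8f}")
```

Under the lr-scaled form, lowering `lr` (for example through a scheduler) also lowers the effective decay, so a `weight_decay` value tuned at one learning rate may not transfer to another.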

thomas-woehrle · 2025-05-23T07:40:49Z

> While I like having some mention of this over none at all, I'm not sure the doc reader will necessarily know what this means from just the phrase. It may be better to include a bigger note:: and include the mathematical difference like mentioned in the issue.

One could link to the comment of the original implementation issue, here. It explains the "problem" nicely. Would this be preferable over a description in the docstring?

> We should mention this in all the *AdamWs, NAdam + RAdam as they have decoupled_weight_decay options.

I could add the note to all of them. This would make it even more desirable to link to an existing explanation compared to explaining the issue in detail in the docstring itself.

Also, if #22343 were implemented, things could change again; just mentioning it.

janeyx99 · 2025-06-27T20:39:24Z

#22343 will likely not be implemented any time soon.

A description in the docstr is still preferred (as a digestible description) and I'm not sure we typically link to issues in docs (though I'm not theoretically opposed). You can define a docstr in optimizer.py and import it in all the relevant files for deduplication.

github-actions · 2025-08-26T21:34:09Z

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

Include lr scaling info in adamw weight_decay docstring

21a5219

thomas-woehrle requested review from albanD and janeyx99 as code owners May 22, 2025 11:44

pytorch-bot bot added the release notes: optim label May 22, 2025

pytorchbot added the open source label May 22, 2025

albanD removed their request for review May 22, 2025 17:43

albanD added the triaged label May 22, 2025

janeyx99 reviewed May 22, 2025


github-actions bot added the Stale label Aug 26, 2025

github-actions bot closed this Sep 25, 2025
