Make ASGD & RProp differentiable #86258
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86258
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures, 5 Pending as of commit 85c9b52. This comment was automatically generated by Dr. CI and updates every 15 minutes.
albanD
left a comment
Awesome work!
torch/optim/adamax.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to update the passed-in exp_inf for this to work properly, no?
exp_inf.copy_(torch.amax(norm_buf, 0, keepdim=False))
good catch!
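To illustrate the point of the suggestion above: if the function only rebinds a local name, the optimizer state tensor the caller passed in never changes; writing the result back with copy_ mutates it in place. A minimal sketch (the helper name, beta2, and eps defaults are hypothetical, not the actual PR diff):

```python
import torch

def update_exp_inf(exp_inf, grad, beta2=0.999, eps=1e-8):
    # Build the pair of candidates for the decayed infinity norm.
    norm_buf = torch.stack([exp_inf * beta2, grad.abs() + eps])
    # copy_ writes the elementwise max back into the caller's state tensor;
    # a plain `exp_inf = torch.amax(...)` would only rebind the local name.
    exp_inf.copy_(torch.amax(norm_buf, 0, keepdim=False))
    return exp_inf
```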
torch/optim/nadam.py
Outdated
nit: could you append: "by updating the grad and exp_avg directly and not using the scalar 'value' argument of addcdiv"?
BTW, on GPU it might be better to do this, as the item() would cause a sync. cc @ptrblck, in case someone on your end has time to check this out?
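A sketch of the trade-off being discussed (hypothetical helper name, not the PR's actual code): addcdiv_'s scalar value argument takes a Python number, so a tensor-valued step size would need .item(), which forces a GPU-to-CPU sync. Folding the step size into the tensor arithmetic keeps the whole update on-device.

```python
import torch

def sync_free_update(param, exp_avg, denom, step_size):
    # step_size is a 0-dim tensor. The scalar-value form would be:
    #   param.addcdiv_(exp_avg, denom, value=-step_size.item())  # .item() syncs on GPU
    # Scaling the numerator by the tensor instead needs no Python scalar:
    param.add_(-step_size * exp_avg / denom)
    return param
```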
(you might want to rebase on top of master for CI to run properly)
Force-pushed ec8759e to 1217410
I added the comments in each part of the PR and removed the dependencies.
albanD
left a comment
SGTM!
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again. (Details for Dev Infra team: raised by workflow job.)
Force-pushed 1217410 to 8679373
Force-pushed 8679373 to 85c9b52
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Blocked by #86183