
Conversation

@emcastillo
Collaborator

Blocked by #86096

@emcastillo emcastillo added the module: optimizer (Related to torch.optim), release notes: nn (release notes category), and topic: improvements (topic category) labels on Oct 4, 2022
@emcastillo emcastillo requested a review from albanD as a code owner October 4, 2022 03:03
@pytorch-bot

pytorch-bot bot commented Oct 4, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86183

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures, 1 Pending

As of commit dc2140d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented Oct 4, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

Contributor

Is item actually needed here?

Collaborator Author

Yeah, addcdiv_ requires value to be a Python scalar. I guess we could unify both execution paths to avoid the device synchronization, but I'm not sure. cc @albanD (who was also concerned about this)
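For context, a minimal sketch of the two execution paths being discussed; the function name, signature, and exact arithmetic are assumptions for illustration, not the actual torch.optim.Adam code:

```python
import torch

def adam_param_update(param, exp_avg, denom, step_size, differentiable):
    """Illustrative only: shows why a tensor-valued step size needs a
    separate path, since addcdiv_'s `value` must be a Python number."""
    if differentiable:
        # step_size may be a tensor kept on the autograd graph, so fold it
        # into the divisor instead of passing it as the scalar `value`.
        return param.addcdiv(exp_avg, denom / -step_size)
    else:
        # Fast path: step_size is a plain Python float, so the fully
        # in-place update with the scalar `value` kwarg can be used.
        return param.addcdiv_(exp_avg, denom, value=-step_size)

# Both branches compute param - step_size * exp_avg / denom.
```

Unifying the paths would mean always treating step_size as a tensor; converting it back to a Python scalar for the value kwarg (for example via .item()) is what forces the device synchronization mentioned above.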

@vadimkantorov
Contributor

vadimkantorov commented Oct 4, 2022

Maybe it's worth introducing an idiomatic clone_if(flag) utility lambda (clone_if = lambda tensor, flag: tensor.clone() if flag else tensor)? This seems to be a repeated idiom across the optimizers, given that the in-place ops are kept (even if the helper gets copied inside all the optimizers), although maybe it's better to keep this verbose with if statements, as it is now...

Collaborator

@albanD albanD left a comment

Wait, is this included in the other PR? :p

@emcastillo emcastillo force-pushed the differentiable-adamvars branch from e30bc50 to 83fe75e on October 6, 2022 05:42
@emcastillo
Collaborator Author

separated all the PRs!

Collaborator

@albanD albanD left a comment

Sounds good to me!

@pytorch-bot pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label on Oct 6, 2022
@emcastillo emcastillo force-pushed the differentiable-adamvars branch from 83fe75e to c19e5fc on October 12, 2022 08:27
@vadimkantorov
Contributor

vadimkantorov commented Oct 12, 2022

Any thoughts about the clone_if idiom? Or are explicit if-statement branches better?

@albanD
Collaborator

albanD commented Oct 12, 2022

Any thoughts about the clone_if idiom? Or are explicit if-statement branches better?

I'm not sure? t = t.clone() if flag else t seems pretty clear, and t = t.clone_if(flag) is not necessarily more readable.
Also, we already have too many methods; I'm not sure we want simple helpers like this added.

pytorchmergebot pushed a commit that referenced this pull request Oct 13, 2022
Blocked by #86183
Pull Request resolved: #86258
Approved by: https://github.com/albanD
@emcastillo emcastillo force-pushed the differentiable-adamvars branch 2 times, most recently from 1e4ec86 to 8484ad3 on October 13, 2022 05:04
@emcastillo
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: The following mandatory check(s) failed (Rule superuser):

Dig deeper by viewing the failures on hud

Details for Dev Infra team (raised by workflow job)

@emcastillo emcastillo force-pushed the differentiable-adamvars branch from 8484ad3 to b1f8705 on October 13, 2022 08:30
@vadimkantorov
Contributor

Any thoughts about the clone_if idiom? Or are explicit if-statement branches better?

I'm not sure? t = t.clone() if flag else t seems pretty clear, and t = t.clone_if(flag) is not necessarily more readable. Also, we already have too many methods; I'm not sure we want simple helpers like this added.

Well, I meant that instead of the if differentiable: if/else branches all over the optimizers, which only change add_ to add and so on, we would have a clone_if helper lambda shared across these optimizers; the code would then be clone_if(tensor, differentiable).add_(...) and there would be only one version of the code. Since the lambda is trivial, it could also just be copied across the optimizer implementations if we want to keep their code independent.
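A self-contained sketch of how the suggested helper would read inside an optimizer (illustrative only; clone_if is not part of torch.optim, and the state tensors here are stand-ins):

```python
import torch

# The proposed helper, as written in the comment above.
clone_if = lambda tensor, flag: tensor.clone() if flag else tensor

differentiable = True
exp_avg = torch.zeros(3, requires_grad=differentiable)  # stand-in optimizer state
grad = torch.ones(3)

# One code path instead of an if/else per op: clone only when the update
# must stay differentiable, then apply the usual in-place op. The in-place
# write lands on the clone, a fresh autograd node, so gradients still flow.
exp_avg = clone_if(exp_avg, differentiable).add_(grad)
```

The trade-off raised below still applies: when the flag is set, memory is traversed twice, once for the clone and once for the add.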

@vadimkantorov
Contributor

t = t.clone() if flag else t seems pretty clear, and t = t.clone_if(flag) is not necessarily more readable?
Yeah, agreed, but I mean using this as an idiom somehow; for that, a clone_if lambda may be better.

Then instead of

```python
if differentiable:
    tensor = tensor.add(eps)
else:
    tensor = tensor.add_(eps)
```

it could always be written as

```python
tensor = clone_if(tensor, differentiable).add_(eps)
```

although maybe this is less efficient?

Also, is there a functorch higher-order function wrapper that could act like torch.add(tensor, eps, inplace=not differentiable), choosing between torch.add and torch.add_?
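No inplace= keyword exists on torch.add, and functorch does not provide such a higher-order wrapper; if one wanted that dispatch, it would have to be a small user-level helper, roughly like this hypothetical sketch:

```python
import torch

def add_maybe_inplace(tensor, other, *, inplace):
    # Hypothetical helper (not a torch/functorch API): dispatch to the
    # in-place variant only when the caller allows it.
    return tensor.add_(other) if inplace else tensor.add(other)

# Usage in the spirit of the question above:
#   tensor = add_maybe_inplace(tensor, eps, inplace=not differentiable)
```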

@albanD
Collaborator

albanD commented Oct 13, 2022

although maybe this is less efficient?

Yes, we go twice over the memory, once to copy it and once to add.

Also, is there a functorch higher-order function wrapper that could act like torch.add(tensor, eps, inplace=not differentiable), choosing between torch.add and torch.add_?

Maybe? That would be a lot of work to add that to all APIs :p

@vadimkantorov
Contributor

Actually, maybe not so many: not that many ops support a proper in-place variant...

@albanD
Collaborator

albanD commented Oct 13, 2022

Also, functorch really doesn't like in-place ops in general ;) so I don't think they will be happy with adding an inplace kwarg, haha.
cc @zou3519

@vadimkantorov
Contributor

Well, maybe not in functorch :) but in core.

@emcastillo emcastillo force-pushed the differentiable-adamvars branch 2 times, most recently from 0148172 to 8f160b2 on October 14, 2022 00:42
@albanD
Collaborator

albanD commented Oct 14, 2022

You can skip the dynamo errors.
The other issues seem relevant, right?

@emcastillo emcastillo force-pushed the differentiable-adamvars branch from 8f160b2 to 3ff8cd2 on October 16, 2022 23:21
@emcastillo
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.


Labels

ciflow/trunk (Trigger trunk jobs on your pull request), cla signed, Merged, module: optimizer (Related to torch.optim), open source, release notes: nn (release notes category), topic: improvements (topic category)
