[FSDP][2/N] Remove params_with_grad
#87480
Conversation
🔗 Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/87480. Note: links to docs will display an error until the docs builds have completed. ✅ No failures, 3 pending as of commit d69725e. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
This PR removes the property `params_with_grad` from `FullyShardedDataParallel`. It was introduced when implementing `clip_grad_norm_()` but was not consistently used. Personally, I do not think it makes sense for `FullyShardedDataParallel` to expose this helper because it is not a common paradigm.

This PR is technically BC-breaking. However, I checked that no one internally is using this API.
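For readers who relied on the removed helper: based on the description above, `params_with_grad` simply filtered a module's parameters to those with a populated gradient. A minimal sketch of an equivalent one-liner (the function name here is hypothetical, not part of the FSDP API) could look like this:

```python
import torch
import torch.nn as nn

def params_with_grad(module: nn.Module):
    """Return the parameters of `module` whose .grad is not None.

    Hypothetical stand-in for the removed FSDP property, assuming it
    filtered on gradient presence as the PR description suggests.
    """
    return [p for p in module.parameters() if p.grad is not None]

model = nn.Linear(4, 2)
loss = model(torch.randn(3, 4)).sum()
loss.backward()

# Both the weight and the bias received gradients.
print(len(params_with_grad(model)))  # prints 2
```

Because this is a plain comprehension over `module.parameters()`, exposing it as a dedicated property on `FullyShardedDataParallel` adds little, which is consistent with the PR's rationale for removing it.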
rohan-varma
left a comment
Sounds good. Please add the BC-breaking label for release tracking purposes.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Pull Request resolved: pytorch#87480
Approved by: https://github.com/rohan-varma
Stack from ghstack:
- #87480 [FSDP][2/N] Remove `params_with_grad`
- #87479 [FSDP][1/N] Rework `clip_grad_norm_()` and tests
cc @ezyang @gchanan
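For context on the companion PR in this stack (#87479), the standard utility being reworked for FSDP is `torch.nn.utils.clip_grad_norm_`. A minimal sketch of its behavior on a plain (non-FSDP) module, for illustration only:

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)
out = model(torch.randn(2, 8)).pow(2).sum()
out.backward()

# clip_grad_norm_ rescales gradients in place so their total norm does
# not exceed max_norm, and returns the total norm measured beforehand.
total_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

# After clipping, the total gradient norm is at most max_norm
# (up to floating-point tolerance).
clipped = torch.linalg.vector_norm(
    torch.stack([torch.linalg.vector_norm(p.grad) for p in model.parameters()])
)
print(bool(clipped <= 1.0 + 1e-4))  # prints True
```

Under FSDP, gradients are sharded across ranks, so the total norm must be computed collectively rather than locally, which is why the stacked PR reworks this path.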