[FSDP][optim_state_dict][10/N] Make optim_state_dict and optim_state_dict_to_load public #92118
…dict_to_load public Make optim_state_dict and optim_state_dict_to_load public APIs and consolidate them with state_dict by using the same state_dict_type to decide how to perform the optimizer state_dict save and load. Differential Revision: [D42488022](https://our.internmc.facebook.com/intern/diff/D42488022/) [ghstack-poisoned]
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/92118
Note: Links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit 5dd0379. This comment was automatically generated by Dr. CI and updates every 15 minutes.
…dict_to_load public Make optim_state_dict and optim_state_dict_to_load public APIs and consolidate them with state_dict by using the same state_dict_type to decide how to perform the optimizer state_dict save and load. Differential Revision: [D42488022](https://our.internmc.facebook.com/intern/diff/D42488022/) ghstack-source-id: 177584342 Pull Request resolved: #92118
rohan-varma left a comment:
LGTM, thanks!
    @dataclass
    class LocalOptimStateDictConfig(OptimStateDictConfig):
        offload_to_cpu: bool = False
Why is the default overridden to False for local, but not for sharded?
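For readers following the question above, here is a minimal sketch of how a dataclass subclass overrides an inherited field default. The classes are stand-ins written for illustration, not the PyTorch definitions: the base default of `True` and the empty `ShardedOptimStateDictConfig` body are assumptions inferred from the question, and only the `LocalOptimStateDictConfig` override appears in the diff.

```python
from dataclasses import dataclass


@dataclass
class OptimStateDictConfig:
    # Assumed base default (not shown in the diff under review).
    offload_to_cpu: bool = True


@dataclass
class ShardedOptimStateDictConfig(OptimStateDictConfig):
    # Hypothetical stand-in: declares nothing, so it inherits the base default.
    pass


@dataclass
class LocalOptimStateDictConfig(OptimStateDictConfig):
    # Redeclaring the field replaces the inherited default for this subclass only.
    offload_to_cpu: bool = False


# Collect the effective default of each config class.
defaults = {
    cls.__name__: cls().offload_to_cpu
    for cls in (
        OptimStateDictConfig,
        ShardedOptimStateDictConfig,
        LocalOptimStateDictConfig,
    )
}
```

Under these assumptions, only the local config ends up with `offload_to_cpu=False`, which is the asymmetry the reviewer is asking about.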
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
Merge failed. Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again.
@pytorchbot merge -f "The failing test is not related."
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Merge failed. Reason: Command […]. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
Stack from ghstack (oldest at bottom):
_is_named_optimizer when the state is empty #93303
Make optim_state_dict and optim_state_dict_to_load public APIs and consolidate them with state_dict by using the same state_dict_type to decide how to perform the optimizer state_dict save and load.
Differential Revision: D42488022
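The consolidation described above, where one state_dict_type setting decides how both the model and the optimizer state are saved and loaded, can be sketched with a toy in plain Python. This is not FSDP's implementation: `ToyStateDictType`, `ToyShardedWrapper`, and the in-memory "shards" are hypothetical names invented for illustration, and the real APIs operate on distributed tensors and take different arguments.

```python
from enum import Enum, auto
from typing import Any, Dict, List


class ToyStateDictType(Enum):
    FULL = auto()     # entries from every rank merged into one dict
    SHARDED = auto()  # only the entries owned by the calling rank


class ToyShardedWrapper:
    """Hypothetical stand-in for a sharded module wrapper (not FSDP)."""

    def __init__(
        self,
        param_shards: List[Dict[str, Any]],
        optim_shards: List[Dict[str, Any]],
        rank: int,
    ) -> None:
        self.param_shards = param_shards
        self.optim_shards = optim_shards
        self.rank = rank
        self.state_dict_type = ToyStateDictType.FULL

    def set_state_dict_type(self, kind: ToyStateDictType) -> None:
        # A single setting that both export paths consult.
        self.state_dict_type = kind

    def _export(self, shards: List[Dict[str, Any]]) -> Dict[str, Any]:
        if self.state_dict_type is ToyStateDictType.FULL:
            merged: Dict[str, Any] = {}
            for shard in shards:
                merged.update(shard)
            return merged
        return dict(shards[self.rank])

    def state_dict(self) -> Dict[str, Any]:
        return self._export(self.param_shards)

    def optim_state_dict(self) -> Dict[str, Any]:
        return self._export(self.optim_shards)

    def optim_state_dict_to_load(self, saved: Dict[str, Any]) -> Dict[str, Any]:
        # Keep only the entries this rank owns, so a full or sharded
        # checkpoint can be handed to this rank's optimizer.
        owned = self.optim_shards[self.rank].keys()
        return {key: value for key, value in saved.items() if key in owned}


wrapper = ToyShardedWrapper(
    param_shards=[{"w0": 1.0}, {"w1": 2.0}],
    optim_shards=[{"w0": {"step": 3}}, {"w1": {"step": 3}}],
    rank=1,
)
full_params = wrapper.state_dict()
full_optim = wrapper.optim_state_dict()

wrapper.set_state_dict_type(ToyStateDictType.SHARDED)
sharded_params = wrapper.state_dict()
sharded_optim = wrapper.optim_state_dict()
loadable = wrapper.optim_state_dict_to_load(full_optim)
```

The point of the sketch is the design choice: because `state_dict()` and `optim_state_dict()` read the same setting, a caller cannot accidentally save model weights in one layout and optimizer state in another.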