-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[FSDP] Register state_dict hooks for FlatParamsWrapper even if params_list is empty #74860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…_list is empty These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
🔗 Helpful links
💊 CI failures summary and remediationsAs of commit 8864624 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patternsThe following CI failures do not appear to be due to upstream breakages:
|
| Job | Step | Action |
|---|---|---|
| Test | 🔁 rerun |
This comment was automatically generated by Dr. CI (expand for details).
Please report bugs/suggestions to the (internal) Dr. CI Users group.
…n if params_list is empty" These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
…_list is empty Pull Request resolved: #74860 These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; any submodule inside FlatParamsWrapper should be pre/post processed by the hooks. ghstack-source-id: 152390556 Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/)
rohan-varma
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but can we add the test described in #74810?
…n if params_list is empty" These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
…_list is empty Pull Request resolved: #74860 These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; any submodule inside FlatParamsWrapper should be pre/post processed by the hooks. ghstack-source-id: 152526345 Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/)
| setattr(module, LINEAR_SKIP, linear_skip) | ||
| return fsdp, linear_skip_tensor_names | ||
|
|
||
| fsdp, linear_skip_tensor_names = _create_module() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linear_skip_tensor_names is unused. I think it is for checking unused parameter in checkpoint, can you add a TODO here?
| loss = fsdp(inp) | ||
| loss.sum().backward() | ||
|
|
||
| state_dict = fsdp.state_dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the overall goal to resolve the issue is to ensure that this can be loaded into a local module, I can try this though and add a follow up change. Or feel free to do so if you have the time.
…n if params_list is empty" These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
…n if params_list is empty" These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
…_list is empty Pull Request resolved: #74860 These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; any submodule inside FlatParamsWrapper should be pre/post processed by the hooks. ghstack-source-id: 152557128 Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/)
…n if params_list is empty" These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
…n if params_list is empty" These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules. Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/) [ghstack-poisoned]
…_list is empty Pull Request resolved: #74860 These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; any submodule inside FlatParamsWrapper should be pre/post processed by the hooks. ghstack-source-id: 152594052 Differential Revision: [D35194483](https://our.internmc.facebook.com/intern/diff/D35194483/)
…_list is empty (#74860) Summary: Pull Request resolved: #74860 These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; any submodule inside FlatParamsWrapper should be pre/post processed by the hooks. ghstack-source-id: 152594052 Test Plan: CI Reviewed By: rohan-varma Differential Revision: D35194483 fbshipit-source-id: c25d7846f317c7ce78d77d335d041fed8db8f3a1
|
Hey @fegin. |
Stack from ghstack (oldest at bottom):
These pre/post hooks must be registered even if the FlatParamsWrapper does not flatten any parameters; the FlatParamsWrapper may contains some FSDP-wrapped submodules.
Differential Revision: D35194483