Dynamo asserts FSDP wrapped modules use_orig_param #89523
Conversation
- This is a strict requirement given the way dynamo+FSDP is implemented, but isn't convenient to assert.
- By plumbing a use_orig_params field on all wrapped modules, we can do this assertion inside dynamo.
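A minimal sketch of the plumbing side, assuming a hypothetical helper name; only the `_fsdp_use_orig_params` attribute and the `use_orig_params` flag come from the PR itself:

```python
import torch.nn as nn

# Hypothetical helper illustrating the plumbing: when FSDP wraps a module with a
# given use_orig_params setting, stamp that setting onto every managed submodule
# so that code which only ever sees the inner modules can still check it later.
def _mark_wrapped_submodules(wrapped_root: nn.Module, use_orig_params: bool) -> None:
    for submodule in wrapped_root.modules():
        # The attribute name matches the field plumbed by this PR.
        submodule._fsdp_use_orig_params = use_orig_params
```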
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/89523
Note: Links to docs will display an error until the docs builds have been completed.
❌ As of commit 098429e, 1 job has failed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
# for this, since Dynamo skips all the FSDP code frames and thus can't inspect the
# FSDP module directly
submodule._fsdp_use_orig_params = use_orig_params
```
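On the dynamo side, the assertion this enables would look roughly like the sketch below; the function name and error message are illustrative, and only the `_fsdp_use_orig_params` attribute comes from the PR:

```python
import torch.nn as nn

# Illustrative check dynamo can now perform on any module FSDP has marked,
# without having to inspect the (skipped) FSDP wrapper frames themselves.
def _assert_fsdp_use_orig_params(module: nn.Module) -> None:
    if getattr(module, "_fsdp_use_orig_params", True) is False:
        raise AssertionError(
            "Dynamo with FSDP requires modules wrapped with use_orig_params=True"
        )
```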
This FSDP change looks fine to me!
OK. The rest of the change is pretty trivial. I think the biggest risk is that we are cluttering up the submodules this way; the dynamo change is not a risk, and the unit test works.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: This PR is too stale; the last push date was more than 3 days ago. Please rebase and try again.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: The following mandatory check(s) failed (Rule …). Dig deeper by viewing the failures on hud.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: 1 additional job has failed; the first of them is: trunk.
@pytorchbot merge -f "Flaky CI"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: Command …
@pytorchbot merge -f "Flaky CI"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Pull Request resolved: pytorch#89523
Approved by: https://github.com/awgu

After #89523, we now need to assert use_orig_params=True, even in the non-recursive case where (I think) we wouldn't otherwise need to run with use_orig_params=True. Tested with `python benchmarks/dynamo/torchbench.py --training --accuracy --only hf_T5 --fsdp`
Pull Request resolved: #90100
Approved by: https://github.com/wconstab
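In practice the constraint looks roughly like the sketch below; the toy model, the device placement, and the use of `torch.compile` are illustrative assumptions, and the process group is assumed to be initialized elsewhere:

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Toy model standing in for a real network; torch.distributed.init_process_group
# and device setup are assumed to have happened already.
model = torch.nn.Linear(16, 16).cuda()

# use_orig_params=True is the setting dynamo asserts on FSDP-wrapped modules;
# wrapping with use_orig_params=False would trip the assertion when tracing.
fsdp_model = FSDP(model, use_orig_params=True)

# Compiling the wrapped model is where dynamo checks the plumbed flag.
compiled_model = torch.compile(fsdp_model)
```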
Stack from ghstack (oldest at bottom):

- This is a strict requirement given the way dynamo+FSDP is implemented, but isn't convenient to assert.
- By plumbing a use_orig_params field on all wrapped modules, we can do this assertion inside dynamo.

cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @desertfire