[FSDP][1/N] Split fully_shard unit tests #92296
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/92296
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 Failures. As of commit c5f0d89: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: 535f26d Pull Request resolved: pytorch#92296
This PR splits `test_fully_shard.py` into `fully_shard/test_fully_shard<...>.py`. This should help improve readability and avoid some future rebase conflicts. The only other real change is resolving a `TODO` for using `run_subtests` in the model checkpointing unit tests. [ghstack-poisoned]
    E2E test of save + load with rank0_only + CPU offload for TransformerWithSharedParams
    on the composable path.
    """
    self.run_subtests(
This is the only real change, where I knocked out a to-do to use run_subtests.
mrshenli left a comment:
LGTM
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 2 mandatory check(s) failed (Rule
Dig deeper by viewing the failures on hud. Details for Dev Infra team: raised by workflow job.
Failures look unrelated:
- linux-bionic-py3_7-clang8-xla / test (xla, 1, 1, linux.4xlarge)
- linux-focal-py3.7-gcc7 / test (default, 1, 2, linux.2xlarge)
- linux-focal-rocm5.3-py3.8 / test (default, 1, 2, linux.rocm.gpu)
@pytorchbot merge -f "unrelated failures"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
ghstack-source-id: 4224de3 Pull Request resolved: pytorch#92296
Stack from ghstack:
- #92298 [FSDP][3/N] Refactor `summon_full_params` unit tests
- #92297 [FSDP][2/N] `_summon_full_params` -> `_unshard_params`
- #92296 [FSDP][1/N] Split `fully_shard` unit tests (this PR)