[FSDP] Default to BACKWARD_PRE
#88428
Closed
Stack from ghstack:
* #88432 [FSDP] Default to `limit_all_gathers=True`
* #88431 [FSDP][Docs] Reword `sharding_strategy` docs and other minor doc changes
* #88429 [FSDP][Docs] Simplify `mixed_precision` ctor docs
* #88428 [FSDP] Default to `BACKWARD_PRE`
* #88260 [FSDP()][Easy] Make `fully_shard()` only `FULL_SHARD`
* #88235 [FSDP()] Have `fully_shard()` abide by `@contract`!
* #88234 [FSDP()][Easy] Rename `_State` to `_FSDPState`
* #88233 [FSDP()] Rename to `fully_shard()` and move to `_composable/`
* #88232 [FSDP][Easy] Remove unneeded `TrainingState` transition
* #88123 [FSDP] Rename `unflat_param_name` -> `fqn` for consistency
* #88122 [FSDP] Simplify `_get_buffer_names()`
* #88121 [FSDP] Remove unneeded `torch.no_grad()` context when offloading to CPU
* #87941 [FSDP()][26/N] Move `_lazy_init()` into `_fsdp_root_pre_forward()`
* #87940 [FSDP()][25/N] Add `_post_forward_reshard()`
* #87939 [FSDP()][24/N] Refactor `_lazy_init()`
* #87935 [FSDP()][21/N] Refactor and fix `_cast_buffers()`
* #87934 [FSDP] Rename `dtype` to `buffer_name_to_dtype`
* #87933 [FSDP] Remove `device` arg from `_cast_buffers()`
* #87931 [FSDP()][18/N] Refactor `pre_forward_unshard()`
* #87930 [FSDP()][17/N] Refactor `_fsdp_root_pre_forward()`
* #87928 [FSDP()][15/N] Refactor `_init_streams()`