Model Checkpointing
===================

DeepSpeed provides routines for checkpointing model state during training.

Loading Training Checkpoints
----------------------------

.. autofunction:: deepspeed.DeepSpeedEngine.load_checkpoint
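
As a rough illustration of how ``load_checkpoint`` might be used to resume training, the sketch below wraps the call in a small helper. It assumes ``engine`` is the engine returned by ``deepspeed.initialize``. The helper name ``resume_from_checkpoint`` and the ``"step"`` key are hypothetical: the key is only present if the caller stored it in the client state when saving.

```python
def resume_from_checkpoint(engine, load_dir, tag=None):
    """Restore engine state and return the step to resume from (0 if nothing was loaded)."""
    # load_checkpoint returns a (load_path, client_state) pair; load_path is
    # None when no checkpoint could be loaded from load_dir.
    load_path, client_state = engine.load_checkpoint(load_dir, tag=tag)
    if load_path is None:
        return 0
    # "step" is a hypothetical key chosen for this sketch, not a DeepSpeed field.
    return (client_state or {}).get("step", 0)
```

With ``tag=None`` the engine is expected to pick the most recently saved tag, so a training script would typically call this once after ``deepspeed.initialize`` and start its loop from the returned step.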

Saving Training Checkpoints
---------------------------

.. autofunction:: deepspeed.DeepSpeedEngine.save_checkpoint
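
A periodic save inside a training loop could look roughly like the sketch below. It assumes ``engine`` is the engine returned by ``deepspeed.initialize``. The helper name ``maybe_save_checkpoint``, the ``save_interval`` default, the ``global_step<N>`` tag format, and the ``"step"`` client-state key are illustrative choices, not requirements of the API.

```python
def maybe_save_checkpoint(engine, save_dir, step, save_interval=1000):
    """Save every `save_interval` steps; return True if a save was issued."""
    if step % save_interval != 0:
        return False
    # Every rank must make this call, not only rank 0, so the decision above
    # has to depend only on values that are identical across ranks.
    engine.save_checkpoint(
        save_dir,
        tag=f"global_step{step}",      # illustrative tag naming
        client_state={"step": step},   # user-defined data stored with the checkpoint
    )
    return True
```

Anything placed in ``client_state`` comes back as the second element of the pair returned by ``load_checkpoint``, which makes it a convenient place for counters such as the current step.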


ZeRO Checkpoint fp32 Weights Recovery
-------------------------------------

DeepSpeed provides routines for extracting consolidated fp32 weights from the optimizer states of a saved ZeRO checkpoint.

.. autofunction:: deepspeed.utils.zero_to_fp32.get_fp32_state_dict_from_zero_checkpoint

.. autofunction:: deepspeed.utils.zero_to_fp32.load_state_dict_from_zero_checkpoint

.. autofunction:: deepspeed.utils.zero_to_fp32.convert_zero_checkpoint_to_fp32_state_dict
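
As a minimal sketch of offline recovery, the helpers below rebuild an fp32 state dict from a checkpoint directory and load it into a plain PyTorch module. The helper names ``recover_fp32_state_dict`` and ``load_fp32_into_model`` are hypothetical, and ``model`` is assumed to be an ordinary ``torch.nn.Module`` on CPU that is not wrapped in a DeepSpeed engine. The import is deferred so the snippet can be read without DeepSpeed installed.

```python
def recover_fp32_state_dict(checkpoint_dir, tag=None):
    """Return a consolidated fp32 state dict rebuilt from a ZeRO checkpoint."""
    # Deferred import: DeepSpeed is only needed when the function is called.
    from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

    return get_fp32_state_dict_from_zero_checkpoint(checkpoint_dir, tag=tag)


def load_fp32_into_model(model, checkpoint_dir, tag=None):
    """Load the recovered fp32 weights into a plain (non-DeepSpeed) module."""
    state_dict = recover_fp32_state_dict(checkpoint_dir, tag=tag)
    # The whole state dict is materialized in host memory, so the machine
    # needs enough CPU RAM to hold the full fp32 model.
    model.load_state_dict(state_dict)
    return model
```

When the model object is already at hand, ``load_state_dict_from_zero_checkpoint`` performs both steps in one call. To write the consolidated weights to disk instead, ``convert_zero_checkpoint_to_fp32_state_dict`` is the relevant routine. Its output argument has differed between DeepSpeed releases, so check the signature documented above for the installed version.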