Conversation

@awgu awgu (Collaborator) commented Oct 27, 2022

Stack from ghstack:

  • This PR defines a new `api.py` meant to hold the public API for FSDP (minus `FullyShardedDataParallel` itself). This is needed because several of the `_<...>_utils.py` files rely on the public API, and they cannot import it from `torch.distributed.fsdp.fully_sharded_data_parallel` without creating a circular import. Naming the file `api.py` follows the convention used by `ShardedTensor`. (See the import sketch after this list.)
  • This PR cleans up the wording in the `BackwardPrefetch`, `ShardingStrategy`, `MixedPrecision`, and `CPUOffload` docstrings.
  • This PR adds the aforementioned classes to `fsdp.rst` so that they render in the public docs.
  • To abide by the public bindings contract (`test_public_bindings.py`), the aforementioned classes are removed from `fully_sharded_data_parallel.py`'s `__all__`. This is technically BC breaking for anyone who uses `from torch.distributed.fsdp.fully_sharded_data_parallel import *`; however, none of our own external or internal code does so.
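
To make the circular-import fix concrete, here is a minimal sketch of the layout this PR moves toward. The class bodies are abbreviated, and `_runtime_utils.py` is only a stand-in name for one of the `_<...>_utils.py` files:

```python
# torch/distributed/fsdp/api.py (sketch): dependency-light public
# definitions live here, so both the internal `_<...>_utils.py` modules
# and fully_sharded_data_parallel.py can import them.
from dataclasses import dataclass
from enum import Enum, auto

class BackwardPrefetch(Enum):
    BACKWARD_PRE = auto()
    BACKWARD_POST = auto()

@dataclass
class CPUOffload:
    offload_params: bool = False

# A utils module (hypothetical `_runtime_utils.py`) now depends only on
# api.py rather than on the module defining FullyShardedDataParallel:
#
#     from torch.distributed.fsdp.api import BackwardPrefetch
#
# fully_sharded_data_parallel.py imports from api.py the same way, so no
# import edge points back into it and the cycle disappears.
```

Since `api.py` imports nothing from the rest of the FSDP package, every other module in the package can safely depend on it.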


pytorch-bot bot commented Oct 27, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/87917

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e688429:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

This PR is easy. I moved `BackwardPrefetch` to a new file `common_utils.py` and reworded the docs a bit.

[ghstack-poisoned]
awgu pushed a commit to awgu/pytorch that referenced this pull request Oct 29, 2022
ghstack-source-id: 6dc3b46
Pull Request resolved: pytorch#87917
@awgu awgu added the topic: bc breaking topic category label Oct 29, 2022
@awgu awgu changed the title [FSDP()][3/N] Refactor BackwardPrefetch enum [FSDP()][3/N] Refactor public APIs Oct 29, 2022
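
For reference, the BC break behind the `topic: bc breaking` label is limited to star-imports from the submodule; package-level imports are untouched. A short sketch of the before/after behavior, assuming the re-exports described above:

```python
# Still supported: package-level imports are unchanged by this PR.
from torch.distributed.fsdp import (
    BackwardPrefetch,
    CPUOffload,
    MixedPrecision,
    ShardingStrategy,
)

prefetch = BackwardPrefetch.BACKWARD_PRE
offload = CPUOffload(offload_params=True)

# No longer supported: a star-import from the submodule stops binding
# these names once they leave its `__all__`.
#
#     from torch.distributed.fsdp.fully_sharded_data_parallel import *
#     BackwardPrefetch  # NameError after this PR
```
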
Andrew Gu added 4 commits October 29, 2022 21:35
awgu pushed a commit to awgu/pytorch that referenced this pull request Oct 30, 2022
ghstack-source-id: 0ba00bf
Pull Request resolved: pytorch#87917
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Nov 5, 2022
Pull Request resolved: pytorch#87917
Approved by: https://github.com/mrshenli
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022
Pull Request resolved: pytorch#87917
Approved by: https://github.com/mrshenli
@facebook-github-bot facebook-github-bot deleted the gh/awgu/147/head branch June 8, 2023 15:23

Labels

  • ciflow/trunk (Trigger trunk jobs on your pull request)
  • release notes: distributed (fsdp) (release notes category)
  • topic: bc breaking (topic category)
