
[DeviceMesh] Make _flatten_mapping an object attribute instead of a class attribute #165521

Closed
fduwjj wants to merge 1 commit into gh/fduwjj/225/base from gh/fduwjj/225/head

@fduwjj fduwjj commented Oct 15, 2025

Stack from ghstack (oldest at bottom):

The _flatten_mapping field was defined as a class attribute with a mutable default value {}:

_flatten_mapping: dict[str, "DeviceMesh"] = {}

This caused every DeviceMesh instance to share one dictionary object. When multiple test instances created flattened meshes with the same name (such as "dp"), they collided in that shared dictionary and raised the error: "Flatten mesh with mesh_dim_name dp has been created before, Please specify another valid mesh_dim_name." Making _flatten_mapping an object attribute gives each instance its own dictionary.
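The shared-dictionary behavior described above can be reproduced without PyTorch. The following is a minimal sketch, not the actual DeviceMesh code: `SharedMesh`, `PerInstanceMesh`, and their `flatten` methods are hypothetical stand-ins, and the error text is abbreviated. It assumes the fix amounts to creating the dictionary in `__init__`, which is the usual way to turn a class attribute into an object attribute.

```python
class SharedMesh:
    """Hypothetical stand-in showing the bug: the dict is a class attribute."""

    # One dict object, created once at class definition and shared by all instances.
    _flatten_mapping: dict[str, "SharedMesh"] = {}

    def flatten(self, name: str) -> "SharedMesh":
        if name in self._flatten_mapping:
            raise RuntimeError(
                f"Flatten mesh with mesh_dim_name {name} has been created before"
            )
        flat = SharedMesh()
        # Mutates the class-level dict, so every other instance sees this entry.
        self._flatten_mapping[name] = flat
        return flat


class PerInstanceMesh:
    """Hypothetical stand-in showing the fix: the dict is an object attribute."""

    def __init__(self) -> None:
        # A fresh dict per instance, so names cannot collide across instances.
        self._flatten_mapping: dict[str, "PerInstanceMesh"] = {}

    def flatten(self, name: str) -> "PerInstanceMesh":
        if name in self._flatten_mapping:
            raise RuntimeError(
                f"Flatten mesh with mesh_dim_name {name} has been created before"
            )
        flat = PerInstanceMesh()
        self._flatten_mapping[name] = flat
        return flat


if __name__ == "__main__":
    a, b = SharedMesh(), SharedMesh()
    a.flatten("dp")
    try:
        b.flatten("dp")  # conflicts even though b never flattened anything
    except RuntimeError as err:
        print("shared dict:", err)

    c, d = PerInstanceMesh(), PerInstanceMesh()
    c.flatten("dp")
    d.flatten("dp")  # no conflict: each instance owns its mapping
    print("per-instance dict: no conflict")
```

In the sketch, the second `SharedMesh` fails only because the first one already registered "dp", which mirrors the cross-test interference the description reports.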

cc @H-Huang @awgu @wanchaol @fegin @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci

pytorch-bot bot commented Oct 15, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165521

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 15a9869 with merge base e6f766c:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

fduwjj added a commit that referenced this pull request Oct 15, 2025
…lass attribute

ghstack-source-id: 16d1431
Pull Request resolved: #165521
pytorch-bot bot added the oncall: distributed label Oct 15, 2025
fduwjj requested review from fegin and lw October 15, 2025 05:21
fduwjj added the release notes: DeviceMesh and ciflow/trunk labels Oct 15, 2025

fduwjj commented Oct 15, 2025

@pytorchbot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
…lass attribute (pytorch#165521)

Pull Request resolved: pytorch#165521
Approved by: https://github.com/fegin, https://github.com/lw
github-actions bot deleted the gh/fduwjj/225/head branch November 15, 2025 02:15

Labels

ciflow/trunk, Merged, oncall: distributed, release notes: DeviceMesh


4 participants