[DTensor] Fix Conv behavior for replicate strategy #167402

Closed
malfet wants to merge 3 commits into gh/malfet/593/base from gh/malfet/593/head

Conversation

@malfet malfet commented Nov 8, 2025

Stack from ghstack (oldest at bottom):

Pass `dim_map` to `_requires_data_exchange` and return `False` if both the spatial and channel dimensions are replicated.
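
A minimal sketch of that check, assuming `dim_map` follows the DTensor convention where entry `i` is the mesh dimension that tensor dimension `i` is sharded on and `-1` means replicated, and assuming the input layout is (batch, channels, spatial...). The name `requires_data_exchange_sketch` and the padding-based fallback are illustrative assumptions, not the code from this PR.

```python
from typing import Sequence


def requires_data_exchange_sketch(
    padding: Sequence[int], dim_map: Sequence[int]
) -> bool:
    """Hypothetical stand-in for the check described above."""
    # Assumed layout: dim_map[0] is batch, dim_map[1] is channels,
    # dim_map[2:] are the spatial dimensions. -1 means "replicated".
    if all(mesh_dim == -1 for mesh_dim in dim_map[1:]):
        # Channels and spatial dims are replicated, so every rank already
        # holds all the data a local convolution needs.
        return False
    # Assumed fallback: only a padded last dimension needs neighbor data.
    return padding[-1] != 0


# Conv3d input sharded on the batch dimension only: no exchange needed.
print(requires_data_exchange_sketch((1, 1, 1), [0, -1, -1, -1, -1]))
# Conv3d input sharded on the last spatial dimension with padding.
print(requires_data_exchange_sketch((0, 0, 1), [-1, -1, -1, -1, 0]))
```

With these assumptions, sharding only the batch dimension short-circuits to `False` regardless of padding, while sharding the last spatial dimension still defers to the padding check.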

Modify `test_conv1d` and `test_conv3d` to check values rather than just shapes, and replicate `conv3d` across the batch dimension.
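
To illustrate why a shape-only assertion is too weak, here is a plain-Python sketch. It uses no DTensor APIs; `conv1d_valid`, `shape_of`, and the two-way batch split are hypothetical helpers chosen for the example, not the actual test code.

```python
from typing import List

Batch = List[List[float]]


def conv1d_valid(batch: Batch, kernel: List[float]) -> Batch:
    """Cross-correlate every row with `kernel` (stride 1, no padding)."""
    k = len(kernel)
    return [
        [sum(row[i + j] * kernel[j] for j in range(k))
         for i in range(len(row) - k + 1)]
        for row in batch
    ]


def shape_of(batch: Batch) -> tuple:
    return (len(batch), len(batch[0]))


full_input = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
kernel = [1.0, -1.0]
reference = conv1d_valid(full_input, kernel)

# Split along the batch dimension, convolve each shard locally, then
# concatenate. No shard needs data from another shard.
shards = [full_input[:1], full_input[1:]]
gathered = [row for shard in shards for row in conv1d_valid(shard, kernel)]

# A deliberately wrong result that still has the right shape.
wrong = [[0.0] * len(row) for row in reference]

print(shape_of(wrong) == shape_of(reference))  # shape check cannot tell them apart
print(wrong == reference)                      # value check can
print(gathered == reference)                   # batch-sharded result matches
```

A shape comparison accepts the zero-filled output, whereas a value comparison rejects it and confirms the batch-sharded result.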

In general, it feels like the current convolution implementation was written to work only if the tensor is sharded across the last dimension.

cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @msaroufim @dcci

[ghstack-poisoned]

pytorch-bot bot commented Nov 8, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167402

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 81ecf4f with merge base 694592a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

malfet added a commit that referenced this pull request Nov 8, 2025
Pass `dim_map` to `_requires_data_exchange` and return `False` if both the spatial and channel dimensions are replicated

Modify `test_conv1d` and `test_conv3d` to check values rather than just shapes, and replicate `conv3d` across the batch dimension


ghstack-source-id: 9da79ec
Pull-Request: #167402
@pytorch-bot pytorch-bot bot added the ciflow/inductor and oncall: distributed labels Nov 8, 2025
@malfet malfet added the topic: bug fixes and release notes: distributed (dtensor) labels Nov 8, 2025
[ghstack-poisoned]
malfet added a commit that referenced this pull request Nov 8, 2025
Pass `dim_map` to `_requires_data_exchange` and return `False` if both the spatial and channel dimensions are replicated

Modify `test_conv1d` and `test_conv3d` to check values rather than just shapes, and replicate `conv3d` across the batch dimension

ghstack-source-id: d22d612
Pull-Request: #167402
[ghstack-poisoned]
malfet added a commit that referenced this pull request Nov 8, 2025
Pass `dim_map` to `_requires_data_exchange` and return `False` if both the spatial and channel dimensions are replicated

Modify `test_conv1d` and `test_conv3d` to check values rather than just shapes, and replicate `conv3d` across the batch dimension

ghstack-source-id: 02f80e8
Pull-Request: #167402
@malfet malfet added the ciflow/trunk label Nov 8, 2025

malfet commented Nov 10, 2025

@pytorchbot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
Pass `dim_map` to `_requires_data_exchange` and return `False` if both the spatial and channel dimensions are replicated

Modify `test_conv1d` and `test_conv3d` to check values rather than just shapes, and replicate `conv3d` across the batch dimension

In general, it feels like the current convolution implementation was written to work only if the tensor is sharded across the last dimension

Pull Request resolved: pytorch#167402
Approved by: https://github.com/ezyang
@github-actions github-actions bot deleted the gh/malfet/593/head branch December 11, 2025 02:20

Labels

ciflow/inductor, ciflow/trunk, Merged, oncall: distributed, release notes: distributed (dtensor), topic: bug fixes
