Skip to content

[MPS] Fix conv layout handling#162776

Closed
malfet wants to merge 18 commits intogh/malfet/516/basefrom
gh/malfet/516/head
Closed

[MPS] Fix conv layout handling#162776
malfet wants to merge 18 commits intogh/malfet/516/basefrom
gh/malfet/516/head

Conversation

@malfet
Copy link
Contributor

@malfet malfet commented Sep 12, 2025

Stack from ghstack (oldest at bottom):

What started as simple fix for mps_convolution_backward_input resulted in a pretty significant refactor/fixes:

  • Updated mps_conv_use_channels_last to return channels last output if either input or weights are channels last
  • Use the same primitive throughout Convolution.mm to determine wether output should be allocated in channels last format or not

But doing only those two, resulted in crash in test_memory_format_nn_Conv2d_mps_float32, when weights were backward, and bias is present:

% python -c "import torch;print(torch.nn.functional.conv2d(torch.rand(2, 4, 3, 4,device='mps'), torch.rand(5, 4, 3, 3,device='mps').to(memory_format=torch.channels_last), torch.rand(5,device='mps')))"
/AppleInternal/Library/BuildRoots/4~B5E4ugDCh2RsPWAjMEoPu8LC5w1yXEwd7XweDhg/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:3619: failed assertion `Error: MLIR pass manager failed'
zsh: abort      python -c 

Which requires a more thorough redesign/cleanup, namely:

  • Do not alter the layout based on MacOS version, but rather do additional copies on MacOS-14 if inputs/output or weight are in channels last format ( done by defining std::optional<Tensor> output_c; that contains a contiguous copy of the output tensor
  • Introduced input_suggested_layout which is set to ChannelsLast if and only if input is channels last and is running on MacOS-15+
  • Delete unused memory_layout and group arguments from fill_depthwise_conv_desc
  • Fix bias broadcasting logic for channels last

As result, in addition to adding one more regression test this change removes expectedFailures from:

  • TestModule.test_memory_format for Conv2d, ConvTranspose2d, LazyConv1d, LazyConvTranspose1d
  • test_require_stride_expanded_dynamic_shapes
  • test_mutable_custom_op_fixed_layout2 for MacOS-14

Fixes #161905

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben

[ghstack-poisoned]
@malfet malfet requested a review from kulinseth as a code owner September 12, 2025 00:08
@pytorch-bot
Copy link

pytorch-bot bot commented Sep 12, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162776

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

⏳ No Failures, 38 Pending

As of commit 337d8ee with merge base eb3fbf5 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/mps Run MPS tests (subset of trunk) release notes: mps Release notes category labels Sep 12, 2025
malfet added a commit that referenced this pull request Sep 12, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: b191a9f
Pull Request resolved: #162776
[ghstack-poisoned]
[ghstack-poisoned]
@pytorch-bot pytorch-bot bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Sep 16, 2025
[ghstack-poisoned]
[ghstack-poisoned]
@malfet malfet changed the title [MPS] Fix conv backwards for channel last [MPS] Fix conv backwards layout handling Sep 16, 2025
malfet and others added 3 commits September 22, 2025 06:17
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 22, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: 9b3908b
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 22, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: df1e15f
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 22, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: c56ff5d
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 24, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: 45dcd30
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 24, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: 828fcfd
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 24, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: 063eca2
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 24, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: 60efb8d
Pull Request resolved: #162776
@malfet malfet changed the title [MPS] Fix conv backwards layout handling [MPS] Fix conv layout handling Sep 24, 2025
@malfet malfet added the topic: bug fixes topic category label Sep 24, 2025
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 24, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: 539a885
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 24, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: e151e4a
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 25, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: f6e498d
Pull Request resolved: #162776
[ghstack-poisoned]
malfet added a commit that referenced this pull request Sep 25, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: d170c07
Pull Request resolved: #162776
malfet added a commit that referenced this pull request Sep 25, 2025
Simply allocate output in channels last form if either input or gradient
is channels last

Fixes #161905

ghstack-source-id: d170c07
Pull Request resolved: #162776
@malfet
Copy link
Contributor Author

malfet commented Sep 25, 2025

@pytorchbot merge -f "Lint + MPS tests are green"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

jainapurva pushed a commit that referenced this pull request Sep 29, 2025
What started as simple fix for `mps_convolution_backward_input` resulted in a pretty significant refactor/fixes:
- Updated `mps_conv_use_channels_last` to return channels last output if either input or weights are channels last
- Use the same primitive throughout `Convolution.mm` to determine wether output should be allocated in channels last format or not

But doing only those two, resulted in crash in `test_memory_format_nn_Conv2d_mps_float32`, when weights were backward, and bias is present:
```
% python -c "import torch;print(torch.nn.functional.conv2d(torch.rand(2, 4, 3, 4,device='mps'), torch.rand(5, 4, 3, 3,device='mps').to(memory_format=torch.channels_last), torch.rand(5,device='mps')))"
/AppleInternal/Library/BuildRoots/4~B5E4ugDCh2RsPWAjMEoPu8LC5w1yXEwd7XweDhg/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:3619: failed assertion `Error: MLIR pass manager failed'
zsh: abort      python -c
```

Which requires a more thorough redesign/cleanup, namely:
- Do not alter the layout based on MacOS version, but rather do additional copies on MacOS-14 if inputs/output or weight are in channels last format ( done by defining `std::optional<Tensor> output_c;` that contains a contiguous copy of the output tensor
- Introduced `input_suggested_layout` which is set to ChannelsLast if and only if input is channels last and is running on MacOS-15+
- Delete unused `memory_layout` and `group` arguments from `fill_depthwise_conv_desc`
- Fix bias broadcasting logic for channels last

As result, in addition to adding one more regression test this change removes `expectedFailures` from:
- `TestModule.test_memory_format` for `Conv2d`, `ConvTranspose2d`, `LazyConv1d`, `LazyConvTranspose1d`
- `test_require_stride_expanded_dynamic_shapes`
-  `test_mutable_custom_op_fixed_layout2` for MacOS-14

Fixes #161905

Pull Request resolved: #162776
Approved by: https://github.com/Skylion007
@github-actions github-actions bot deleted the gh/malfet/516/head branch October 26, 2025 02:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/mps Run MPS tests (subset of trunk) Merged module: cpu CPU specific problem (e.g., perf, algorithm) module: inductor release notes: mps Release notes category topic: bug fixes topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants