
Conversation

@qqaatw
Collaborator

@qqaatw qqaatw commented Dec 18, 2022

Previously, the "can slice" flag in the Placeholder constructor in OperationUtils.mm was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack y_mps and x_mps on the last dim:

t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)

x_mps.unsqueeze(-1) is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.
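
For reference, a minimal sketch of the layout properties involved, run on CPU purely for inspection (the MPS views in the example above have the same layout):

```
import torch

t = torch.tensor([1, 2, 3, 4])
x = t[2:].unsqueeze(-1)     # sliced, then unsqueezed view of t

print(x.storage_offset())   # 2    -> non-zero storage offset into t's storage
print(x.is_contiguous())    # True -> still expressible as a plain slice of the base storage
print(t.dim(), x.dim())     # 1 2  -> dims differ, so the old ndim-equality check rejected slicing
```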

Fixes #87856
Fixes #91065

cc @kulinseth @albanD @malfet @DenisVieriu97 @razarmehr @abhudev

@qqaatw qqaatw requested a review from kulinseth as a code owner December 18, 2022 04:31
@pytorch-bot

pytorch-bot bot commented Dec 18, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91071

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1a8786b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/mps Run MPS tests (subset of trunk) release notes: mps Release notes category labels Dec 18, 2022
@qqaatw qqaatw changed the title from "[MPS] Fix tensor with storage offset graph gathering" to "[MPS] Fix tensor with non-zero storage offset graph gathering" Dec 18, 2022
@qqaatw
Collaborator Author

qqaatw commented Dec 19, 2022

@pytorchbot label "module: mps"

@pytorch-bot pytorch-bot bot added the module: mps Related to Apple Metal Performance Shaders framework label Dec 19, 2022
@qqaatw qqaatw force-pushed the fix_view_gather_logic branch from ffd155b to f64db8c on December 19, 2022 11:28
Contributor

@malfet malfet left a comment


This PR contains unrelated changes, and there is not much of an explanation as to why it is necessary to make a copy of the tensor view.

I.e., it probably fixes the symptom, but not the fundamental underlying problem.

res_mps = x_mps <= y_mps
res_cpu = x_cpu <= y_cpu
if operator == "<":
elif operator == "<":
Contributor

This seems unrelated to the change in question, isn't it?

Collaborator Author

Yes, it's not closely related, but I think this improvement is trivial and small enough not to cause any reading difficulty. I feel it would be more costly and unnecessary to make this improvement in a separate PR, wouldn't it?
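
For context, a hedged sketch of what the test-side tidy-up being discussed presumably looks like; the full test body is not shown in this thread, so the helper name and structure below are illustrative only:

```
# Illustrative only: the duplicated `operator == "<"` branch in the excerpt
# above is presumably meant to separate "<" from "<=".
def compare(x_mps, y_mps, x_cpu, y_cpu, operator):
    if operator == "<=":
        res_mps = x_mps <= y_mps
        res_cpu = x_cpu <= y_cpu
    elif operator == "<":
        res_mps = x_mps < y_mps
        res_cpu = x_cpu < y_cpu
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return res_mps, res_cpu
```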

@ngimel ngimel added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Dec 21, 2022
qqaatw and others added 7 commits February 15, 2023 01:36
Fixes backward pass for bilinear.

Summary of changes:
- the bilinear op is able to produce **contiguous, non-view** tensors with a storage offset, such as shape=`[1, 1, 1, 1]`, `storage_offset=12`. This seems like a weird case, but it is valid, and for this type of tensor we wouldn't be able to gather/scatter since we look at the view flag (which is not set here). This change looks at `storage_offset` only, rather than the is_view flag, which is not being set (a minimal sketch follows below).
- **reduction sum** must return a zeroed-out output if it is passed an input with 0 elements (e.g. a shape of `(0, 5)`).
Pull Request resolved: pytorch#94892
Approved by: https://github.com/kulinseth
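
A hedged illustration of the two cases above (not the actual bilinear backward code; the tensor here is built artificially with Tensor.set_ just to reproduce the layout):

```
import torch

# A contiguous, non-view tensor that nevertheless carries a storage offset,
# constructed artificially via Tensor.set_():
base = torch.arange(16.0)
t = torch.tensor([])
t.set_(base.storage(), storage_offset=12, size=(1, 1, 1, 1))
print(t.is_contiguous(), t.storage_offset())   # True 12

# A reduction over an input with zero elements must produce a zeroed output
# of the reduced shape:
x = torch.zeros(0, 5)
print(torch.sum(x, dim=0))                     # tensor([0., 0., 0., 0., 0.])
```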
@qqaatw qqaatw force-pushed the fix_view_gather_logic branch from f64db8c to 70fb582 on February 16, 2023 10:40
@qqaatw
Collaborator Author

qqaatw commented Feb 16, 2023

@pytorchbot label "ciflow/trunk"

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Feb 16, 2023
@qqaatw
Collaborator Author

qqaatw commented Feb 16, 2023

@pytorchbot label "keep-going"

@pytorch-bot pytorch-bot bot added the keep-going Don't stop on first failure, keep running tests until the end label Feb 16, 2023
if (!mpsShape) {
mpsShape = getMPSShape(_tensor);
}
}
Collaborator

Fix this.

Collaborator Author

This is purposely an indentation fix for the if (!mpsShape) { block.

@kulinseth
Collaborator

Modulo the nit, changes look good.

@qqaatw
Collaborator Author

qqaatw commented Feb 17, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: trunk / macos-12-py3-x86-64 / test (default, 1, 3, macos-12)

Details for Dev Infra team Raised by workflow job

@qqaatw
Collaborator Author

qqaatw commented Feb 17, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Feb 22, 2023
…h#91071)

Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:
```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.

Fixes pytorch#87856
Fixes pytorch#91065

Pull Request resolved: pytorch#91071
Approved by: https://github.com/kulinseth
This was referenced Feb 22, 2023
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Feb 23, 2023
…h#91071)

Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:
```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.

Fixes pytorch#87856
Fixes pytorch#91065

Pull Request resolved: pytorch#91071
Approved by: https://github.com/kulinseth
kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Feb 23, 2023
…h#91071)

Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:
```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.

Fixes pytorch#87856
Fixes pytorch#91065

Pull Request resolved: pytorch#91071
Approved by: https://github.com/kulinseth
atalman pushed a commit that referenced this pull request Feb 24, 2023
* [MPS] Fix the uint8 type issue with View ops kernels (#95145)

This should fix the problem in the ResNet model with image artifacts due to saturation on the int8 type, and also the incorrect class recognition reported in #86954.

Fixes #86954

Pull Request resolved: #95145
Approved by: https://github.com/kulinseth, https://github.com/DenisVieriu97

* [MPS] Fix tensor with non-zero storage offset graph gathering (#91071)

Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:
```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.

Fixes #87856
Fixes #91065

Pull Request resolved: #91071
Approved by: https://github.com/kulinseth

* [MPS] Fix fill_ where input tensor has a storage offset (#95113)

Fixes #94390

Apart from fixing the issue above, this PR also fixes a bug where, when an input tensor can be sliced, a sliced array view is created. This array view seems to be either not writable or backed by different storage from the original tensor, causing incorrect results with the in-place `fill`.
Pull Request resolved: #95113
Approved by: https://github.com/kulinseth

* [MPS] Fix view op slicing for 2nd dim in case of 0 offset (#95381)

* Fix view op slicing for 2nd dim in case of 0 offset

Pull Request resolved: #95381
Approved by: https://github.com/razarmehr

---------

Co-authored-by: Ramin Azarmehr <razarmehr@apple.com>
Co-authored-by: Li-Huai (Allan) Lin <qqaatw@gmail.com>
Co-authored-by: Denis Vieriu <104024078+DenisVieriu97@users.noreply.github.com>
pruthvistony added a commit to ROCm/pytorch that referenced this pull request May 2, 2023
pruthvistony pushed a commit to ROCm/pytorch that referenced this pull request May 3, 2023
* [MPS] Fix the uint8 type issue with View ops kernels (pytorch#95145)

This should fix the problem in the ResNet model with image artifacts due to saturation on the int8 type, and also the incorrect class recognition reported in pytorch#86954.

Fixes pytorch#86954

Pull Request resolved: pytorch#95145
Approved by: https://github.com/kulinseth, https://github.com/DenisVieriu97

* [MPS] Fix tensor with non-zero storage offset graph gathering (pytorch#91071)

Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:
```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.

Fixes pytorch#87856
Fixes pytorch#91065

Pull Request resolved: pytorch#91071
Approved by: https://github.com/kulinseth

* [MPS] Fix fill_ where input tensor has a storage offset (pytorch#95113)

Fixes pytorch#94390

Apart from fixing the issue above, this PR also fixes a bug where, when an input tensor can be sliced, a sliced array view is created. This array view seems to be either not writable or backed by different storage from the original tensor, causing incorrect results with the in-place `fill`.
Pull Request resolved: pytorch#95113
Approved by: https://github.com/kulinseth

* [MPS] Fix view op slicing for 2nd dim in case of 0 offset (pytorch#95381)

* Fix view op slicing for 2nd dim in case of 0 offset

Pull Request resolved: pytorch#95381
Approved by: https://github.com/razarmehr

---------

Co-authored-by: Ramin Azarmehr <razarmehr@apple.com>
Co-authored-by: Li-Huai (Allan) Lin <qqaatw@gmail.com>
Co-authored-by: Denis Vieriu <104024078+DenisVieriu97@users.noreply.github.com>
jhavukainen pushed a commit to kulinseth/pytorch that referenced this pull request Mar 15, 2024
…h#91071)

Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the base shape and the view shape have the same number of dimensions. This doesn't account for the case where a view tensor is a sliced and then unsqueezed version of the base tensor, which results in a different number of dimensions.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:
```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]

res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed, contiguous tensor with a non-zero storage offset; tensors of this kind should be sliceable without cloning their storage.

Fixes pytorch#87856
Fixes pytorch#91065

Pull Request resolved: pytorch#91071
Approved by: https://github.com/kulinseth

Labels

ciflow/mps Run MPS tests (subset of trunk) ciflow/trunk Trigger trunk jobs on your pull request keep-going Don't stop on first failure, keep running tests until the end Merged module: mps Related to Apple Metal Performance Shaders framework open source release notes: mps Release notes category triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[MPS] Incorrect results of two multiplied chunked tensors produced by unsafe_chunk
torch.stack gives wrong results on MPS

7 participants