[MPS] Fix tensor with non-zero storage offset graph gathering #91071
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91071
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 1a8786b. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "module: mps" |
Force-pushed from ffd155b to f64db8c
malfet left a comment
This PR contains unrelated changes, and not much of an explanation as to why it is necessary to make a copy of the tensor view.
I.e., it probably fixes the symptom, but not the fundamental underlying problem.
```diff
  res_mps = x_mps <= y_mps
  res_cpu = x_cpu <= y_cpu
- if operator == "<":
+ elif operator == "<":
```
This seems unrelated to the change in question, doesn't it?
Yes, it's not closely related, but I think this improvement is trivial and small enough not to cause any reading difficulty. I feel it would be more costly and unnecessary to make this improvement in a separate PR.
Fixes backward pass for bilinear. Summary of changes:
- The bilinear op is able to produce **contiguous, non-view** tensors with a storage offset, such as shape=`[1, 1, 1, 1]`, `storage_offset=12`. This seems like a weird case, but it is valid, and for these types of tensors we wouldn't be able to gather/scatter, since we look at the view flag (which is not set here). This change looks at `storage_offset` only rather than the is_view flag, which is not being set.
- **reduction sum** must return a zeroed-out output if passed an input with 0 elements (e.g. a shape of (0, 5)).

Pull Request resolved: pytorch#94892
Approved by: https://github.com/kulinseth
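A minimal, hypothetical repro sketch for the two cases described above (not taken from the PR's test suite); the module sizes, shapes, and expected outputs are illustrative assumptions based on the commit message.

```python
import torch

# Case 1: bilinear backward -- the op that produced problematic contiguous,
# non-view gradient tensors with a non-zero storage offset on MPS.
m = torch.nn.Bilinear(2, 3, 1).to("mps")
x1 = torch.randn(1, 2, device="mps", requires_grad=True)
x2 = torch.randn(1, 3, device="mps", requires_grad=True)
m(x1, x2).sum().backward()            # should gather correctly after the fix
print(x1.grad.shape, x2.grad.shape)   # torch.Size([1, 2]) torch.Size([1, 3])

# Case 2: a reduction sum over a tensor with 0 elements must return zeros.
empty = torch.empty(0, 5, device="mps")
print(torch.sum(empty, dim=0).cpu())  # expected: tensor([0., 0., 0., 0., 0.])
```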
Force-pushed from f64db8c to 70fb582
@pytorchbot label "ciflow/trunk"
@pytorchbot label "keep-going"
```
    if (!mpsShape) {
      mpsShape = getMPSShape(_tensor);
    }
  }
```
Fix this.
This is purposely an indentation fix for the `if (!mpsShape) {` block.
Modulo the nit, changes look good.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed, the first few of them are: trunk / macos-12-py3-x86-64 / test (default, 1, 3, macos-12). Details for the Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
* [MPS] Fix the uint8 type issue with View ops kernels (#95145)
  This should fix the problem in the Resnet model with image artifacts due to saturation on the int8 type, and also the incorrect class recognition reported in #86954.
  Fixes #86954
  Pull Request resolved: #95145
  Approved by: https://github.com/kulinseth, https://github.com/DenisVieriu97

* [MPS] Fix tensor with non-zero storage offset graph gathering (#91071)
  Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the numbers of dimensions of the base shape and the view shape are the same. This doesn't consider the situation that a view tensor could be the base tensor's sliced and then unsqueezed version, resulting in a different number of dims. For example, if we want to stack `y_mps` and `x_mps` on the last dim:

  ```
  t_mps = torch.tensor([1, 2, 3, 4], device="mps")
  x_mps = t_mps[2:]  # [3, 4]
  y_mps = t_mps[:2]  # [1, 2]
  res_mps = torch.stack((y_mps, x_mps), dim=-1)
  ```

  the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

  ```
  res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
  ```

  `x_mps.unsqueeze(-1)` is an unsqueezed and contiguous tensor with a storage offset; this kind of tensor should be sliceable without cloning its storage.
  Fixes #87856
  Fixes #91065
  Pull Request resolved: #91071
  Approved by: https://github.com/kulinseth

* [MPS] Fix fill_ where input tensor has a storage offset (#95113)
  Fixes #94390
  Apart from fixing the issue above, this PR also fixes a bug where, when an input tensor can be sliced, a sliced array view is created. This array view seems to be not writable or to have a different storage from the original tensor, causing incorrect results with the in-place `fill`.
  Pull Request resolved: #95113
  Approved by: https://github.com/kulinseth

* [MPS] Fix view op slicing for 2nd dim in case of 0 offset (#95381)
  Pull Request resolved: #95381
  Approved by: https://github.com/razarmehr

---------

Co-authored-by: Ramin Azarmehr <razarmehr@apple.com>
Co-authored-by: Li-Huai (Allan) Lin <qqaatw@gmail.com>
Co-authored-by: Denis Vieriu <104024078+DenisVieriu97@users.noreply.github.com>
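A minimal sketch of the `fill_`-with-storage-offset case from #95113, with illustrative values assumed here rather than copied from that PR's tests; the expected output follows from the description above.

```python
import torch

# In-place fill on a sliced (storage-offset) MPS tensor; before the fix the
# write could land in a separate storage and leave the base tensor unchanged.
t_mps = torch.zeros(4, device="mps")
t_mps[2:].fill_(1.0)
print(t_mps.cpu())  # expected: tensor([0., 0., 1., 1.])

# CPU reference for comparison.
t_cpu = torch.zeros(4)
t_cpu[2:].fill_(1.0)
print(t_cpu)        # tensor([0., 0., 1., 1.])
```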
…pytorch#91071)" This reverts commit 0a9c608.
Previously, the "can slice" flag in the Placeholder constructor in `OperationUtils.mm` was conditioned on whether the numbers of dimensions of the base shape and the view shape are the same. This doesn't consider the situation that a view tensor could be the base tensor's sliced and then unsqueezed version, resulting in a different number of dims.

For example, if we want to stack `y_mps` and `x_mps` on the last dim:

```
t_mps = torch.tensor([1, 2, 3, 4], device="mps")
x_mps = t_mps[2:]  # [3, 4]
y_mps = t_mps[:2]  # [1, 2]
res_mps = torch.stack((y_mps, x_mps), dim=-1)
```

the kernel will unsqueeze both of them on the last dim and then concatenate them, which is equivalent to:

```
res_mps = torch.cat((y_mps.unsqueeze(-1), x_mps.unsqueeze(-1)), dim=-1)
```

`x_mps.unsqueeze(-1)` is an unsqueezed and contiguous tensor with a storage offset; this kind of tensor should be sliceable without cloning its storage.

Fixes #87856
Fixes #91065

cc @kulinseth @albanD @malfet @DenisVieriu97 @razarmehr @abhudev
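A small parity check of the behaviour this PR fixes, written as a hypothetical sketch rather than copied from the PR's tests: once slicing no longer requires the base and view shapes to have the same number of dimensions, the stacked MPS result should match the CPU result.

```python
import torch

# Build the example from the PR description on both devices.
t_cpu = torch.tensor([1, 2, 3, 4])
t_mps = t_cpu.to("mps")

x_cpu, y_cpu = t_cpu[2:], t_cpu[:2]
x_mps, y_mps = t_mps[2:], t_mps[:2]

# x_mps.unsqueeze(-1) is contiguous but carries a storage offset of 2;
# it should be gatherable by slicing, without cloning its storage.
print(x_mps.unsqueeze(-1).is_contiguous(), x_mps.unsqueeze(-1).storage_offset())

res_cpu = torch.stack((y_cpu, x_cpu), dim=-1)
res_mps = torch.stack((y_mps, x_mps), dim=-1)
print(torch.equal(res_cpu, res_mps.cpu()))  # expected: True
```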