
Conversation

@mruberry
Collaborator

@mruberry commented Jul 26, 2020

See #41027.

This adds a helper for resizing outputs to ATen/native/Resize.* and updates TensorIterator to use it. The helper emits a warning if a tensor with one or more elements needs to be resized; the warning notes that these resizes will become an error in a future PyTorch release.
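As a user-level illustration (editor's sketch; the exact warning text and the operator used are incidental), the deprecated pattern is an out= tensor that already has elements but the wrong shape:

```python
import torch

a = torch.randn(2, 3)
b = torch.randn(2, 3)

# `out` has elements but the wrong shape, so the operation must resize it;
# with this change that resize emits a deprecation warning.
out = torch.empty(5)
torch.add(a, b, out=out)

# A zero-element `out` may still be resized silently.
out = torch.empty(0)
torch.add(a, b, out=out)  # no warning
```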

There are many functions in PyTorch that resize their outputs but don't use TensorIterator. For example:

output.resize_({batch_size, n_output_plane, output_height, output_width});

These functions will need to be updated to use this helper, too. This PR leaves them out since the work is separable, which should let review focus on the helper and its behavior. A TODO in the code reflects this.

@dr-ci

dr-ci bot commented Jul 26, 2020

💊 CI failures summary and remediations

As of commit a4332dc (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@mruberry changed the title from "[WIP] Warns when TensorIterator would resize its output" to "Warns when TensorIterator would resize its output" on Jul 27, 2020
@mruberry requested review from ezyang, gchanan and ngimel on July 27, 2020 00:40
"shape ", output.sizes(), ", which does not match the required ",
"output shape ", shape, ".",
"This behavior is deprecated, and in a future PyTorch release outputs ",
"will not be resized unless they have zero elements.");
Contributor


I generally like to tell users how to fix warnings in the warning message. Wondering if we can give more specific guidance in the case when output.numel() == shape.numel(). In this case, we could tell the user to explicitly view their output so that the warning goes away.
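For example (editor's sketch of that suggestion; names and the operator are illustrative), when the element counts already match, the user can pass a correctly shaped view instead of letting the operation resize out:

```python
import torch

a = torch.randn(2, 3)
b = torch.randn(2, 3)

out = torch.empty(6)                  # same numel as the (2, 3) result, different shape
torch.add(a, b, out=out)              # would warn: out must be resized to (2, 3)

torch.add(a, b, out=out.view(2, 3))   # explicit view of the same storage: no resize, no warning
```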

@ezyang
Contributor

ezyang commented Jul 27, 2020

Is the test suite warnings clean after this change?

@ezyang
Contributor

ezyang commented Jul 27, 2020

It seems that this will warn even in the case where no reallocation would have taken place. When I look at the issue, it seems like the concern brought up by @vadimkantorov at #41027 (comment) isn't really addressed by this PR. I am also concerned about this use case: I have seen multiple end user reports where people are purposely overallocating buffers for their outputs and then specifying them as out= arguments. If someone is doing this to avoid fragmentation, how can they avoid this warning? There isn't really any good way to do so.
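A sketch of the over-allocation pattern being described (editor's illustration; sizes are arbitrary):

```python
import torch

a = torch.randn(128, 128)
b = torch.randn(128, 128)

# Deliberately over-allocated buffer, reused across calls to avoid allocator
# fragmentation. Its shape never matches the result, so with this change every
# call emits the deprecation warning even though no reallocation happens.
out = torch.empty(1_000_000)
torch.add(a, b, out=out)
```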

@ngimel
Collaborator

ngimel commented Jul 27, 2020

@ezyang Are users doing it for inference only? It's very tricky to use out= in training (possible only via custom autograd Functions).
To avoid this warning, people would need to allocate their large buffer, and then resize_ it to 0 size. Then the warning will be avoided, and the out will be resize_d to use allocated storage. It may seem like a roundabout way of achieving what currently happens, but current behavior is confusing for smaller-than-needed out and discontiguous out.
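In code, the workaround looks roughly like this (editor's sketch; device and sizes are illustrative):

```python
import torch

# Pre-allocate a large buffer once, e.g. to avoid allocator fragmentation.
buf = torch.empty(1_000_000)

# Shrink it to zero elements: the underlying storage is kept, but numel() == 0,
# so out= is allowed to resize it without a warning.
buf.resize_(0)

a = torch.randn(128, 128)
b = torch.randn(128, 128)
torch.add(a, b, out=buf)   # buf is resize_d to (128, 128), reusing the existing storage
```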

@mruberry
Collaborator Author

Is the test suite warnings clean after this change?

No! Tests aren't clean and we could/should fix them before taking a PR like this.

I have seen multiple end user reports where people are purposely overallocating buffers for their outputs and then specifying them as out= arguments. If someone is doing this to avoid fragmentation, how can they avoid this warning?

This is actually worse than you're making it seem since in the future we would prohibit this behavior! But...

To avoid this warning, people would need to allocate their large buffer, and then resize_ it to 0 size. Then the warning will be avoided, and the out will be resize_d to use allocated storage. It may seem like a roundabout way of achieving what currently happens, but current behavior is confusing for smaller-than-needed out and discontiguous out.

Another imperfect option is to take views of a larger tensor. But as @ngimel and I discussed offline this can be challenging in some cases.
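A sketch of the view-based alternative (editor's illustration; the slicing scheme is up to the user, which is where it gets awkward for mixed shapes and dtypes):

```python
import torch

# One large flat buffer, carved into fixed regions by the user.
arena = torch.empty(2048)

out1 = arena[:512].view(16, 32)       # region for the first result
out2 = arena[512:1024].view(16, 32)   # region for the second result

a = torch.randn(16, 32)
b = torch.randn(16, 32)
torch.add(a, b, out=out1)   # shapes already match, so no resize and no warning
torch.mul(a, b, out=out2)
```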

I would like to let users better express performance optimizations like this, but since they're inherently niche power user activities I think it's OK to make those users jump through one more hoop to be explicit about the behavior they want.

@ezyang
Contributor

ezyang commented Jul 28, 2020

To avoid this warning, people would need to allocate their large buffer, and then resize_ it to 0 size. Then the warning will be avoided, and the out will be resize_d to use allocated storage

OK, this is a neat trick but I agree it works with the way things are today. We should probably document this in the warning, or provide an API for more explicitly "reserving" space in the storage without actually having it show up in the tensor (this is in line with std::vector::reserve, so it seems like a reasonable API surface).

@ezyang
Contributor

ezyang commented Jul 28, 2020

No! Tests aren't clean and we could/should fix them before taking a PR like this.

It would be nice to know the order of magnitude of splash damage here; that would inform whether or not we should fix them before taking this PR.

@mruberry
Collaborator Author

No! Tests aren't clean and we could/should fix them before taking a PR like this.

It would be nice to know the order of magnitude of splash damage here; that would inform whether or not we should fix them before taking this PR.

I'll just fix them.

@mruberry
Collaborator Author

mruberry commented Jul 30, 2020

@ezyang @ngimel I replaced the warning with an error (temporarily) to catch all instances of this behavior covered by our CI, and I have updated the tests as needed. There was actually a bug in quantized multiplication where the output size of the tensor was being inferred incorrectly since it didn't account for broadcasting. I fixed that issue, too.
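To illustrate the shape-inference point (editor's sketch, not the quantized kernel itself): the required out shape is the broadcast of the operands' shapes, not either operand's own shape.

```python
import torch

a = torch.randn(4, 1)
b = torch.randn(1, 5)

# The required output shape must account for broadcasting:
print(torch.mul(a, b).shape)   # torch.Size([4, 5]), not (4, 1) or (1, 5)
```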

Please take a look at the updated language and test fixes. The effect of this change on our own codebase, tests included, seems small.

@mruberry requested a review from ezyang on July 30, 2020 04:30
Contributor

@facebook-github-bot left a comment


@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mruberry merged this pull request in 2f840b1.
