Warns when TensorIterator would resize its output #42079
Conversation
💊 CI failures summary and remediations (as of commit a4332dc, per the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚
aten/src/ATen/native/Resize.cpp (outdated)

    "shape ", output.sizes(), ", which does not match the required ",
    "output shape ", shape, ".",
    "This behavior is deprecated, and in a future PyTorch release outputs ",
    "will not be resized unless they have zero elements.");
I generally like to tell users how to fix warnings in the warning message. Wondering if we can give more specific guidance in the case when output.numel() == shape.numel(). In this case, we could tell the user to explicitly view their output so that the warning goes away.
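The suggestion above can be sketched as follows. This is a minimal illustration, not code from the PR: when the output already has the right number of elements but the wrong shape, viewing it with the required shape sidesteps the resize (and therefore the warning). `torch.add` and `torch.empty` are used here only as stand-ins for any op that takes an `out=` argument.

```python
import torch

a = torch.ones(2, 3)
b = torch.ones(2, 3)
out = torch.empty(6)  # right element count, wrong shape

# Resizing `out` to (2, 3) would trigger the deprecation warning;
# passing a view with the required shape avoids any resize, and
# writing through the view still fills the original buffer.
torch.add(a, b, out=out.view(2, 3))
```

After the call, `out` holds the six result values in its original 1-D shape.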
Is the test suite warning-clean after this change?
It seems that this will warn even in cases where no reallocation would have taken place. Looking at the issue, the concern brought up by @vadimkantorov at #41027 (comment) isn't really addressed by this PR. I am also concerned about this use case: I have seen multiple end-user reports where people purposely over-allocate buffers for their outputs and then specify them as outputs.
@ezyang Are users doing it for inference only? It's very tricky to use.
No! Tests aren't clean and we could/should fix them before taking a PR like this.
This is actually worse than you're making it seem since in the future we would prohibit this behavior! But...
Another imperfect option is to take views of a larger tensor. But as @ngimel and I discussed offline, this can be challenging in some cases. I would like to let users better express performance optimizations like this, but since they're inherently niche power-user activities, I think it's OK to make those users jump through one more hoop to be explicit about the behavior they want.
OK, this is a neat trick, and I agree it works with the way things are today. We should probably document this in the warning, or provide an API for more explicitly "reserving" space in the storage without having it show up in the tensor (this is in line with std::vector::reserve, so it seems like a reasonable API surface).
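The "views of a larger tensor" trick discussed above might look like the following sketch. This is illustrative only, assuming today's view semantics; the buffer size and ops are made up for the example:

```python
import torch

buf = torch.empty(100)      # over-allocate once, "reserving" capacity
out = buf.narrow(0, 0, 10)  # a view covering just the first 10 elements

# The view's shape matches the result exactly, so the op writes in place,
# no resize is needed, and the deprecation warning never fires.
torch.add(torch.ones(10), torch.ones(10), out=out)
```

The view shares storage with `buf`, so later calls can carve differently sized views out of the same reserved buffer without reallocating.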
It would be nice to know the order of magnitude of splash damage here; that would inform whether or not we should fix them before taking this PR.
I'll just fix them.
@ezyang @ngimel I replaced the warning with an error (temporarily) to catch all instances of this behavior covered by our CI, and I have updated the tests as needed. There was actually a bug in quantized multiplication where the output size of the tensor was being inferred incorrectly since it didn't account for broadcasting. I fixed that issue, too. Please take a look at the updated language and test fixes. The effect of this change on our own codebase, tests included, seems small.
facebook-github-bot left a comment:
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
See #41027.
This adds a resize-output helper to ATen/native/Resize.* and updates TensorIterator to use it. The helper throws a warning if a tensor with one or more elements needs to be resized, and the warning indicates that these resizes will become an error in a future PyTorch release.
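From the user's side, the behavior being deprecated can be sketched like this (a minimal illustration assuming current `out=` semantics; `torch.add` is just a representative TensorIterator op):

```python
import torch

out = torch.empty(3)  # non-empty, and the wrong shape for the result

# TensorIterator resizes `out` from (3,) to (2, 3) to hold the result;
# with this change, doing so emits the deprecation warning.
torch.add(torch.ones(2, 3), torch.ones(2, 3), out=out)
```

Passing an `out` tensor that is empty, or already the right shape, does not warn.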
There are many functions in PyTorch that resize their outputs but don't use TensorIterator. For example: pytorch/aten/src/ATen/native/cuda/NaiveConvolutionTranspose2d.cu, line 243 (at commit 985fd97).
These functions will need to be updated to use the helper, too. This PR leaves them out since that work is separable, which should keep this review focused on the helper and its behavior. A TODO in the code reflects this.