[Quant][core][improvements] Combined dispatch registration for max_pool2d & quantized_max_pool2d and implemented max_pool2d_with_indices_out_quantized_cpu #72353
Conversation
…quantized_max_pool2d [ghstack-poisoned]
💊 CI failures summary and remediations: As of commit 8ef7c8c (more details on the Dr. CI page):
🕵️ 9 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages.
…x_pool2d & quantized_max_pool2d" Summary: This PR is part of a series of PRs addressing #54150, related to using dispatcher for calls to quantized backends as opposed to if/else conditionals. This particular PR removes the is_quantized check from max_pool2d, and implements a quantized kernel for max_pool2d_with_indices. In addition, quantized_max_pool2d has been removed from the frontend, which was previously not used anyways. [ghstack-poisoned]
…l2d & quantized_max_pool2d Summary: This PR is part of a series of PRs addressing #54150, related to using dispatcher for calls to quantized backends as opposed to if/else conditionals. This particular PR removes the is_quantized check from max_pool2d, and implements a quantized kernel for max_pool2d_with_indices. In addition, quantized_max_pool2d has been removed from the frontend, which was previously not used anyways. ghstack-source-id: 57b9ac5 Pull Request resolved: #72353
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
// the output tensor is already computed from quantized_max_pool2d. junk_out is used as a dummy
// argument because it is required by max_pool2d_kernel
auto junk_out = at::empty(out.sizes(), self.int_repr().options());
max_pool2d_kernel(kCPU, junk_out, indices, self.int_repr(), kW, kH, dW, dH, padW, padH, dilationW, dilationH);
we should be able to get the quantized output (quint8) based on the output of max_pool2d_kernel (uint8) I think, the output of max_pool2d_kernel will be the int_repr of the actual quantized output, we can make a quantized tensor with _make_per_tensor_quantized_tensor
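A minimal sketch of what that suggestion could look like, reusing the variable names from the quoted diff (out, indices, self, kW, kH, dW, dH, padW, padH, dilationW, dilationH); this is an illustration of the suggestion, not the code that landed:
```cpp
// Run the uint8 kernel once on the int representation...
auto int_out = at::empty(out.sizes(), self.int_repr().options());
max_pool2d_kernel(kCPU, int_out, indices, self.int_repr(),
                  kW, kH, dW, dH, padW, padH, dilationW, dilationH);
// ...then wrap the uint8 result as a quantized tensor carrying the input's
// quantization parameters, instead of calling quantized_max_pool2d separately.
auto q_out = at::_make_per_tensor_quantized_tensor(
    int_out, self.q_scale(), self.q_zero_point());
```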
Yes, the code as written looks awful, you're running max_pool2d twice to get out the quantized output and then the indices. Terrible. Please do this modification.
test failures look alarming
switch (input.suggest_memory_format()) {
  case at::MemoryFormat::Contiguous: {
    AT_DISPATCH_FLOATING_TYPES_AND(ScalarType::BFloat16, input.scalar_type(), "max_pool2d", [&] {
    AT_DISPATCH_ALL_TYPES_AND(ScalarType::BFloat16, input.scalar_type(), "max_pool2d", [&] {
oh?
This looks like you will need an OpInfo update
ezyang left a comment
don't run max pool2d twice just cuz you need indices lol
…for max_pool2d & quantized_max_pool2d" Summary: This PR is part of a series of PRs addressing #54150, related to using dispatcher for calls to quantized backends as opposed to if/else conditionals. This particular PR removes the is_quantized check from max_pool2d, and implements a quantized kernel for max_pool2d_with_indices. In addition, quantized_max_pool2d has been removed from the frontend, which was previously not used anyways. This PR also introduces isnan() support for vectorized int tensors. Differential Revision: [D34085075](https://our.internmc.facebook.com/intern/diff/D34085075) [ghstack-poisoned]
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
… for max_pool2d & quantized_max_pool2d and implemented max_pool2d_with_indices_out_quantized_cpu" Summary: This PR is part of a series of PRs addressing #54150, related to using dispatcher for calls to quantized backends as opposed to if/else conditionals. This particular PR removes the is_quantized check from max_pool2d, and implements a quantized kernel for max_pool2d_with_indices. This PR also introduces isnan() support for vectorized int tensors. This PR relies on #74560, which introduces structured kernel support for quantized tensors. Test plan: ``` python test/test_quantization.py -k test_max_pool2d ``` Differential Revision: [D35420901](https://our.internmc.facebook.com/intern/diff/D35420901) [ghstack-poisoned]
…tch registration for max_pool1d & quantized_max_pool1d" Summary: This PR is part of a series of PRs addressing #54150, related to using dispatcher for calls to quantized backends as opposed to if/else conditionals. This particular PR removes the is_quantized check from max_pool1d and modifies max_pool1d_impl to be compatible with int tensors. This PR relies on #74560, which introduces structured kernel support for quantized tensors and #72353. Test plan: ``` python test/test_quantization.py -k test_max_pool1d ``` Differential Revision: [D35431831](https://our.internmc.facebook.com/intern/diff/D35431831) [ghstack-poisoned]
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
const int dilationW = dilation.size() == 1 ? dilationH : safe_downcast<int, int64_t>(dilation[1]);

max_pool2d_kernel(kCPU, out, indices, self, kW, kH, dW, dH, padW, padH, dilationW, dilationH);
set_quantizer_(out, make_per_tensor_affine_quantizer(self.q_scale(), self.q_zero_point(), out.scalar_type()));
random question that is probably not the fault of this PR: is it really correct to set_quantizer_ on the out argument like this? Suppose I take a view of a quantized tensor and then write into it, the quantizers of the view and the base what then become inconsistent. That seems bad!
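A hedged illustration of that aliasing concern (the wrapper function name is made up; this is not code from the PR):
```cpp
#include <ATen/ATen.h>

// Illustration only: a view shares storage (and today its quantizer) with its base.
void quantizer_aliasing_example() {
  auto base = at::quantize_per_tensor(at::randn({4, 4}), /*scale=*/0.1,
                                      /*zero_point=*/0, at::kQUInt8);
  auto view = base.slice(/*dim=*/0, /*start=*/0, /*end=*/2);  // aliases base's storage
  // If an out= kernel now called set_quantizer_(view, ...) with different
  // parameters, view.q_scale() and base.q_scale() would disagree even though
  // both tensors alias the same underlying uint8 storage.
}
```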
bool res{false};
c10::guts::if_constexpr<std::is_integral<scalar_t>::value> (
    [&res] () { res = false; }, // if integral type
    [&res, val] () { res = std::isnan(val); } // if not integral type
why not just define a helper function
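A sketch of what such a helper could look like; the name check_isnan is hypothetical, and ATen may already provide an equivalent utility:
```cpp
#include <cmath>
#include <type_traits>

// "Is NaN" that is trivially false for integral types, so callers can use it
// uniformly across dtypes without branching on is_integral at the call site.
template <typename T,
          typename std::enable_if<std::is_integral<T>::value, int>::type = 0>
inline bool check_isnan(T /*val*/) {
  return false;  // integral values can never be NaN
}

template <typename T,
          typename std::enable_if<!std::is_integral<T>::value, int>::type = 0>
inline bool check_isnan(T val) {
  return std::isnan(val);  // floating-point path
}

// Usage at the site above would reduce to: bool res = check_isnan(val);
```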
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Stack from ghstack (oldest at bottom):
Summary: This PR is part of a series of PRs addressing #54150,
related to using the dispatcher for calls to quantized backends instead of if/else conditionals.
This particular PR removes the is_quantized check from max_pool2d, and implements a quantized
kernel for max_pool2d_with_indices.
This PR also introduces isnan() support for vectorized int tensors.
This PR relies on #74560, which introduces
structured kernel support for quantized tensors.
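For illustration, a hedged sketch of the behavior this change is aiming for (not a test from this PR; at::max_pool2d and at::quantize_per_tensor are standard ATen calls):
```cpp
#include <ATen/ATen.h>

// With the combined dispatch registration, calling max_pool2d on a
// QuantizedCPU tensor should route through the dispatcher to the quantized
// kernel rather than through an is_quantized() branch in the frontend.
void quantized_max_pool2d_example() {
  auto q = at::quantize_per_tensor(at::randn({1, 3, 8, 8}), /*scale=*/0.1,
                                   /*zero_point=*/0, at::kQUInt8);
  auto pooled = at::max_pool2d(q, /*kernel_size=*/{2, 2});
  TORCH_CHECK(pooled.is_quantized());  // output stays quantized
}
```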
Test plan:
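```
python test/test_quantization.py -k test_max_pool2d
```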
Differential Revision: D35420901