[quant][core] Add quantize/dequantize ops for decomposed quantized Tensor representation #87093
Conversation
…nsor representation

Summary: Added q/dq implementation for out of core (decomposed) quantized Tensor representation, meaning that instead of storing quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we will store the quantization parameters in the arguments of the operators.

```
quantize(float32_tensor, scale, zero_point, dtype) -> int8_tensor
dequantize(int8_tensor, scale, zero_point, dtype) -> float32_tensor
```

Test Plan:

python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize
python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize

[ghstack-poisoned]
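(A minimal Python sketch of the semantics described above — illustrative only, not the PR's kernels; deriving the clamp bounds from the dtype is an assumption here, and later revisions in this stack pass quant_min/quant_max explicitly:)

```
import torch

def quantize(float32_tensor, scale, zero_point, dtype):
    # affine quantization: round(x / scale) + zero_point, clamped to the
    # representable range of the target integer dtype (assumed bounds)
    qmin, qmax = torch.iinfo(dtype).min, torch.iinfo(dtype).max
    return torch.clamp(torch.round(float32_tensor / scale) + zero_point,
                       qmin, qmax).to(dtype)

def dequantize(int8_tensor, scale, zero_point, dtype):
    # cast to float before subtracting zero_point so unsigned dtypes
    # cannot wrap around
    return (int8_tensor.to(torch.float32) - zero_point) * scale
```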
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/87093
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures, 1 Pending (as of commit 88b0cb8).
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…uantized Tensor representation" Summary: Added q/dq implementation for out of core (decomposed) quantized Tensor representation, meaning that instead of storing quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we will store quantization parameters in the argument of operators. ``` quantize(float32_tensor, scale, zero_point, dtype) -> int8_tensor dequantize(int8_tensor, scale, zero_point, dtype) -> float32_tensor ``` Test Plan: python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
dzdang left a comment:
lgtm
@pytorchbot merge -g
Merge started. Your change will be merged once all checks on your PR pass, since you used the green (-g) flag (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: the following mandatory check(s) failed (Rule). Dig deeper by viewing the failures on hud. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -g
Merge started. Your change will be merged once all checks on your PR pass, since you used the green (-g) flag (ETA: 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: the following mandatory check(s) failed (Rule). Dig deeper by viewing the failures on hud. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: the following mandatory check(s) failed (Rule). Dig deeper by viewing the failures on hud. Details for Dev Infra team: raised by workflow job.
@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here.
Successfully rebased |
…nsor representation

Summary: Added q/dq implementation for out of core (decomposed) quantized Tensor representation, meaning that instead of storing quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we will store the quantization parameters in the arguments of the operators.

```
quantize(float32_tensor, scale, zero_point, quant_min, quant_max, dtype) -> int8_tensor
dequantize(int8_tensor, scale, zero_point, quant_min, quant_max, dtype) -> float32_tensor
```

Test Plan:

python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize
python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize

ghstack-source-id: df27483
Pull Request resolved: #87093
@pytorchbot rebase
@pytorchbot successfully started a rebase job. Check the current status here.
Successfully rebased |
```
quantized_decomposed_lib.define(
    "quantize_per_tensor(Tensor input, float scale, int zero_point, int quant_min, int quant_max, ScalarType dtype) -> Tensor")
```
So we need to register exactly the same schema somewhere in the edge runtime. That seems a bit hard to maintain with the current infra: if we change the schema here, we need to change it in the edge runtime too, otherwise it won't work.
so should we define these in native_functions.yaml?
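(For context, a rough sketch of the torch.library define/impl pairing being discussed; the schema string comes from the diff above, while the dispatch key and kernel body here are illustrative assumptions:)

```
import torch
from torch.library import Library, impl

quantized_decomposed_lib = Library("quantized_decomposed", "DEF")
quantized_decomposed_lib.define(
    "quantize_per_tensor(Tensor input, float scale, int zero_point, "
    "int quant_min, int quant_max, ScalarType dtype) -> Tensor")

# illustrative kernel; any runtime loading this op must see the same schema
@impl(quantized_decomposed_lib, "quantize_per_tensor", "CompositeExplicitAutograd")
def quantize_per_tensor(input, scale, zero_point, quant_min, quant_max, dtype):
    return torch.clamp(torch.round(input / scale) + zero_point,
                       quant_min, quant_max).to(dtype)
```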
```
assert input.dtype == dtype, f"Expecting input to have dtype: {dtype}"
if dtype in [torch.uint8, torch.int8]:
    # TODO: investigate why
    # (input - zero_point).to(torch.float32) * scale
```
nit: I think this fails because of under-/overflow? For example, if dtype(input) == uint8, input == 0, and zero_point > 0, you will have wrapped-around values:

```
>>> import torch
>>> a = torch.tensor([1, 2, 3], dtype=torch.uint8)
>>> a - 3
tensor([254, 255, 0], dtype=torch.uint8)
```
yeah, could be, but I remember seeing fbgemm doing the same: https://github.com/pytorch/FBGEMM/blob/main/include/fbgemm/QuantUtils.h#L139-L141. Or maybe the subtraction in (src - qparams.zero_point) happens in int32, since qparams.zero_point is an int32: https://github.com/pytorch/FBGEMM/blob/main/include/fbgemm/QuantUtilsAvx2.h#L21
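(A minimal repro sketch, mine rather than from the PR or fbgemm, showing the wraparound and how a widening cast before the subtraction avoids it:)

```
import torch

a = torch.tensor([1, 2, 3], dtype=torch.uint8)
zero_point = 3

# subtracting directly in uint8 wraps around
print(a - zero_point)                  # tensor([254, 255, 0], dtype=torch.uint8)

# widening before the subtraction gives the intended signed values
print(a.to(torch.int32) - zero_point)  # tensor([-2, -1, 0], dtype=torch.int32)
```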
```
import torch
from torch.library import Library, impl


quantized_decomposed_lib = Library("quantized_decomposed", "DEF")
```
This name is giving me jitters :)
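(For what it's worth, this namespace is what call sites see once kernels are registered; a hedged example, with made-up scale/zero_point/quant_min/quant_max values:)

```
import torch

x = torch.randn(4)
# hypothetical values; requires an implementation registered for the op
xq = torch.ops.quantized_decomposed.quantize_per_tensor(
    x, 0.1, 128, 0, 255, torch.uint8)
```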
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Hey @jerryzh168. |
…nsor representation (pytorch#87093)

Summary: Added q/dq implementation for out of core (decomposed) quantized Tensor representation, meaning that instead of storing quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we will store the quantization parameters in the arguments of the operators.

```
quantize(float32_tensor, scale, zero_point, dtype) -> int8_tensor
dequantize(int8_tensor, scale, zero_point, dtype) -> float32_tensor
```

Test Plan:

python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize
python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize

Pull Request resolved: pytorch#87093
Approved by: https://github.com/dzdang, https://github.com/z-a-f
Stack from ghstack (oldest at bottom):

Summary: Added q/dq implementation for out of core (decomposed) quantized Tensor representation, meaning that instead of storing quantization parameters (e.g. scale/zero_point) in a separate quantized Tensor object, we will store the quantization parameters in the arguments of the operators.
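The operator signatures, as recorded in the commit messages in this thread (the landed schema additionally threads quant_min/quant_max through, per the diff discussion above):

```
quantize(float32_tensor, scale, zero_point, dtype) -> int8_tensor
dequantize(int8_tensor, scale, zero_point, dtype) -> float32_tensor
```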
Test Plan:
python test/test_quantization.py TestQuantizedTensor.test_decomposed_quantize
python test/test_quantization.py TestQuantizedTensor.test_decomposed_dequantize