[quant][pt2e] Add more precise representation for quantized add #104130
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/104130
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit bbf8bd2: This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D45628032
Force-pushed from 008c63c to b5f3f8e
This pull request was exported from Phabricator. Differential Revision: D45628032
Force-pushed from b5f3f8e to 87e9cfd
This pull request was exported from Phabricator. Differential Revision: D45628032
Force-pushed from 87e9cfd to b8ba365
This pull request was exported from Phabricator. Differential Revision: D45628032
Force-pushed from b8ba365 to 8aa1e7f
This pull request was exported from Phabricator. Differential Revision: D45628032
Force-pushed from 8aa1e7f to bbf8bd2
This pull request was exported from Phabricator. Differential Revision: D45628032
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Summary:
The planned end-to-end flow for quantization in PyTorch 2.0 export is the following:
float_model -> prepare_pt2e -> calibration -> convert_pt2e -> ...
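A minimal sketch of how these stages compose, assuming the prepare_pt2e/convert_pt2e entry points in torch.ao.quantization.quantize_pt2e; the export step and the quantizer construction are elided since those APIs were still in flux, so `exported_model`, `quantizer`, and `calibration_inputs` are assumed inputs here, not part of this PR:
```
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e

def quantize_flow(exported_model, quantizer, calibration_inputs):
    # insert observers according to the quantizer's annotations
    prepared = prepare_pt2e(exported_model, quantizer)
    # calibration: run representative data through the observed model
    for inputs in calibration_inputs:
        prepared(*inputs)
    # fold observers into q/dq ops, then rewrite to the precise representation
    return convert_pt2e(prepared)
```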
Inside convert_pt2e, we first produce a q/dq representation of the quantized model, similar to the previous output of convert_to_reference_fx in FX graph mode quantization:
```
torch.ops.quantized_decomposed.dequantize_per_tensor -> torch.ops.aten.add -> torch.ops.quantized_decomposed.quantize_per_tensor
torch.ops.quantized_decomposed.dequantize_per_tensor /
```
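For reference, a rough sketch of what these decomposed ops compute, illustrating the affine quantization semantics; the real ops also take dtype and quant_min/quant_max arguments:
```
import torch

def dequantize_per_tensor(q, scale, zero_point):
    # int8 -> fp32: subtract the zero point, then scale
    return (q.to(torch.float32) - zero_point) * scale

def quantize_per_tensor(x, scale, zero_point, quant_min=-128, quant_max=127):
    # fp32 -> int8: scale, shift by the zero point, round and clamp
    q = torch.round(x / scale) + zero_point
    return torch.clamp(q, quant_min, quant_max).to(torch.int8)
```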
Then we rewrite the above into a representation that expresses the intent more precisely: here we actually want to do int8 addition, rather than simulate it with fp32 operations. The representation for quantized add is:
```
def quantized_add(x_i8, x_scale, x_zero_point, y_i8, y_scale, y_zero_point, out_scale, out_zero_point):
    # algebraic rewrite of quantize(dequantize(x_i8) + dequantize(y_i8)),
    # where dequantize(q) = (q - zero_point) * scale
    x = (x_scale / out_scale) * x_i8
    y = (y_scale / out_scale) * y_i8
    out = x + y
    out -= (x_zero_point * x_scale + y_zero_point * y_scale) / out_scale
    out += out_zero_point
    return out
```
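As a quick sanity check (illustration only, with made-up values, not part of the PR), the rewritten form above agrees with dequantize -> add -> quantize when the final rounding and clamping are ignored:
```
def dequant(q, scale, zp):
    return (q - zp) * scale

def quant(x, scale, zp):
    return x / scale + zp  # rounding/clamping to int8 omitted

x_i8, x_scale, x_zp = 10, 0.5, 2      # dequantizes to 4.0
y_i8, y_scale, y_zp = 20, 0.25, 4     # dequantizes to 4.0
out_scale, out_zp = 0.1, -8

reference = quant(dequant(x_i8, x_scale, x_zp) + dequant(y_i8, y_scale, y_zp), out_scale, out_zp)
rewritten = quantized_add(x_i8, x_scale, x_zp, y_i8, y_scale, y_zp, out_scale, out_zp)
assert abs(reference - rewritten) < 1e-9  # both evaluate to 72.0
```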
Test Plan:
```
buck2 test caffe2/test:quantization_pt2e -- --exact 'caffe2/test:quantization_pt2e - test_representation_add (quantization.pt2e.test_quantize_pt2e.TestQuantizePT2E)'
```
Reviewed By: kimishpatel
Differential Revision: D45628032
cc @voznesenskym @penguinwu @anijain2305 @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @ipiszy @chenyang78