Conversation

@Xia-Weiwen Xia-Weiwen (Collaborator) commented Dec 22, 2022

Summary
This applies to the onednn quantization backend only.
Currently, FX fusion requires that all the ops in a fused module/op share the same qconfig. To support linear - leaky_relu and linear - tanh fusion with the onednn backend, we previously set the same qconfig explicitly on linear, leaky_relu and tanh. However, this causes two problems:

  • It breaks linear - relu fusion, since relu then does not have the same qconfig as linear. Setting an explicit qconfig on all these ops is also undesirable; they should fall back to the global qconfig by default.
  • Tanh requires a fixed_qparams_qconfig, otherwise it is not quantized, so we cannot assign it a different qconfig.

There is no straightforward way to satisfy both constraints at once. This PR fixes the problems as follows:

  • Do not set an explicit qconfig on these ops, so that they use the global qconfig and linear - relu and linear - leaky_relu are fused correctly.
  • Require users to set the same qconfig on linear and tanh manually when they want to fuse linear - tanh with the onednn backend (see the sketch below).

A known limitation remains: users cannot fuse linear - tanh and quantize standalone tanh at the same time.
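
For reference, here is a minimal sketch of how a user might set the qconfig manually for linear - tanh fusion after this change. It is an illustration under assumptions, not code from this PR: the toy module `M` and the choice of `default_fixed_qparams_range_neg1to1_qconfig` as the shared qconfig are mine; the APIs used (`get_default_qconfig_mapping`, `QConfigMapping.set_object_type`, `get_onednn_backend_config`, `prepare_fx`/`convert_fx`) are standard torch.ao.quantization ones.

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.qconfig import default_fixed_qparams_range_neg1to1_qconfig
from torch.ao.quantization.backend_config import get_onednn_backend_config
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class M(torch.nn.Module):
    """Toy module: linear followed by tanh."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)
        self.tanh = torch.nn.Tanh()

    def forward(self, x):
        return self.tanh(self.linear(x))

# Start from the default onednn mapping (its global qconfig already covers
# linear - relu / linear - leaky_relu), then give linear and tanh one shared
# qconfig so FX fusion sees matching qconfigs for the linear - tanh pattern.
qconfig_mapping = get_default_qconfig_mapping("onednn")
shared_qconfig = default_fixed_qparams_range_neg1to1_qconfig  # assumption: tanh's fixed-qparams qconfig
qconfig_mapping.set_object_type(torch.nn.Linear, shared_qconfig) \
               .set_object_type(torch.nn.Tanh, shared_qconfig)

example_inputs = (torch.randn(1, 8),)
prepared = prepare_fx(
    M().eval(), qconfig_mapping, example_inputs,
    backend_config=get_onednn_backend_config(),
)
converted = convert_fx(prepared, backend_config=get_onednn_backend_config())
```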

Test plan
python test/test_quantization.py -k test_qconfig_dict_with_fused_modules

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10


pytorch-bot bot commented Dec 22, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/91297

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f850cad:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Xia-Weiwen Xia-Weiwen requested a review from jgong5 December 22, 2022 07:13
@github-actions github-actions bot added the "release notes: quantization" (release notes category) label Dec 22, 2022
@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch from a5b01ce to e47778d Compare December 22, 2022 11:43
@Xia-Weiwen Xia-Weiwen changed the title from "[Quant][ONEDNN] set qconfig explicitly for relu" to "[Quant][Fx] Fix issue: qconfig_mappings of onednn backend are not correctly set for fused modules" Dec 22, 2022
@Xia-Weiwen Xia-Weiwen added the "intel" (This tag is for PR from Intel) label Dec 22, 2022
Collaborator

Does setting the qconfig to None mean we use FP32 for Tanh? I thought we should use the same qconfig for linear and Tanh to enable the fusion, shouldn't we?

Contributor

Right, None means don't quantize.
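
To make the distinction concrete, here is a small illustration (my own sketch using the standard QConfigMapping API, not code from this PR): setting an op's qconfig to None keeps it in FP32, whereas leaving it unset lets it fall back to the global qconfig.

```python
import torch
from torch.ao.quantization import QConfigMapping, get_default_qconfig

global_qconfig = get_default_qconfig("onednn")

qconfig_mapping = (
    QConfigMapping()
    .set_global(global_qconfig)             # ops without an entry fall back to this
    .set_object_type(torch.nn.Tanh, None)   # explicit None: leave Tanh unquantized (FP32)
)
# Omitting the Tanh entry entirely would instead make Tanh use the global qconfig.
```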

Collaborator Author

Thanks, good point. I thought None meant "use the global qconfig" and intended tanh to use the global qconfig so that tanh and qlinear share the same qconfig. Since that was wrong and did not work, I have to find another way to fix it.
For fixed-qparams ops (e.g. tanh), the QConfig must specify a FixedQParamsObserver or a FixedQParamsFakeQuantize:

if not isinstance(activation_post_process, FixedQParamsObserver) and \
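
For context, a qconfig of the kind that check accepts could be built like the following sketch, assuming the stock torch.ao.quantization observers; the scale/zero_point values mirror tanh's [-1, 1] output range and are not taken from this PR.

```python
import torch
from torch.ao.quantization import QConfig
from torch.ao.quantization.observer import FixedQParamsObserver, default_weight_observer

# tanh outputs lie in [-1, 1]; a FixedQParamsObserver encodes that range with a
# fixed scale and zero_point instead of deriving them from observed data.
tanh_activation_observer = FixedQParamsObserver.with_args(
    scale=2.0 / 256.0,
    zero_point=128,
    dtype=torch.quint8,
    quant_min=0,
    quant_max=255,
)
fixed_qparams_qconfig = QConfig(
    activation=tanh_activation_observer,
    weight=default_weight_observer,
)
```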

@jerryzh168 Do you know any solution or workaround? Thanks!

Collaborator Author

It looks like there is no easy way to set correct qconfig_mappings for both standalone tanh and linear - tanh fusion, so I removed these lines. Users will need to set the qconfig manually if they want to fuse linear - tanh with the onednn backend.

@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch from e47778d to 5274795 Compare January 4, 2023 02:18
@Xia-Weiwen (Collaborator Author)

Hi @jgong5, could you review this PR? Thanks!

@jgong5 jgong5 (Collaborator) left a comment

I'm fine with the change, but requiring the same qconfig for all the ops in a fusion pattern sounds like too rigid a requirement. May I know if this will be improved in the PT 2.0 quant flow? @jerryzh168

@Xia-Weiwen Xia-Weiwen marked this pull request as ready for review January 5, 2023 02:38
@Xia-Weiwen Xia-Weiwen requested a review from z-a-f as a code owner January 5, 2023 02:38
@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch 2 times, most recently from c16bb47 to ae7f76b Compare January 10, 2023 07:24
@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch from ae7f76b to 0e417e3 Compare January 16, 2023 06:11
@Xia-Weiwen (Collaborator Author)

Hi @jerryzh168, please review. Thanks!

Comment on lines +5801 to +5810
Contributor

Why is this not part of get_default_qconfig_mapping?

Contributor

I see, I just saw the summary message

Contributor

Also, please add a TODO comment to all call sites that use this; we want to refactor them later when the new API is ready.

@Xia-Weiwen Xia-Weiwen (Collaborator Author) Jan 18, 2023

OK, it has been added.

Contributor

Could you please add a comment here summarizing what you said in the summary? It will be helpful for future reference when the new API is ready.

Collaborator Author

OK, it has been added.

@jerryzh168 jerryzh168 (Contributor) left a comment

Please add a comment in the code describing the current status of how to set the qconfig for linear - tanh fusion.

@jerryzh168 (Contributor)

> I'm fine with the change, but requiring the same qconfig for all the ops in a fusion pattern sounds like too rigid a requirement. May I know if this will be improved in the PT 2.0 quant flow? @jerryzh168

Yes, this will be improved in the quantize_pt2e flow.

@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch 2 times, most recently from aaf321d to b90b5fe Compare January 18, 2023 08:27
@Xia-Weiwen (Collaborator Author)

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the "ciflow/trunk" (Trigger trunk jobs on your pull request) label Jan 19, 2023
@pytorchmergebot (Collaborator)

Merge failed

Reason: Not merging any PRs at the moment because there is a merge blocking https://github.com/pytorch/pytorch/labels/ci:%20sev issue open at:
#92602


@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch from b90b5fe to 895cf15 Compare January 20, 2023 05:34
@Xia-Weiwen Xia-Weiwen force-pushed the fix_onednn_backend_qconfig_issue branch from 895cf15 to f850cad Compare January 26, 2023 02:48
@Xia-Weiwen (Collaborator Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Jan 27, 2023
#90001)

**Summary**
This work continues #83784 by @vkuzo and includes all the changes from that PR.
Quote from #83784:
> Issue #83658 reports that ops followed by a certain pattern of `view` and `size` ops were not quantized correctly by FX graph mode quantization.
> Before this PR, the "size" op was in the "op shares qparams with input" category, and the code assumed that the input of this op has the same dtype as its output. This led to incorrectly propagating the `int` dtype as the output of whichever op was preceding the `view` op, which in turn made that op blocklisted from quantization.

> The fix is to create a new category of ops which work on different dtypes of tensors but are not observed. This PR does so for `size`, and also for `shape` since it works the same way.

**Note**: This PR needs #91297 to land first, otherwise there is a UT failure.

**Test plan**
```
python test/test_quantization.py -k test_linear_size_view
python test/test_quantization.py -k test_linear_shape_view
```
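
As an illustration of the pattern described in the quoted text, a module like the following (a minimal sketch of my own, not taken from #90001's tests) was previously left unquantized because of how the size op propagated dtypes:

```python
import torch

class LinearSizeView(torch.nn.Module):
    """Minimal example of the linear -> size -> view pattern discussed above."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)

    def forward(self, x):
        x = self.linear(x)
        # x.size(0) returns a Python int. Before the fix, FX graph mode quantization
        # treated "size" as sharing qparams/dtype with its input, so the int dtype was
        # propagated backwards and the preceding linear was blocked from quantization.
        n = x.size(0)
        return x.view(n, -1)
```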

Pull Request resolved: #90001
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
@Xia-Weiwen Xia-Weiwen deleted the fix_onednn_backend_qconfig_issue branch November 13, 2024 06:07

Labels

ciflow/trunk · intel · Merged · open source · release notes: AO frontend · release notes: quantization
