[Quant][FX] Add backend config for onednn backend and fuse Linear-LeakyReLU #88665
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88665. Note: links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit 680c177. This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
    .set_object_type(torch.nn.LayerNorm, qconfig_layernorm) \

if backend == 'onednn':
    qconfig_mapping.set_object_type(torch.nn.LeakyReLU, qconfig) \
```
Do we support quantization for a standalone LeakyReLU module/op?
I think the fused module is required to share the same qconfig as the separate modules/ops; otherwise the unit tests fail. So I added these here.
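For context, a minimal sketch of what this means in the test setup (assuming `qconfig` comes from `get_default_qconfig("onednn")`; the exact test code may differ): the standalone modules receive the same qconfig as the fused pattern, so the observed dtypes agree before and after fusion.

```python
import torch
from torch.ao.quantization import QConfigMapping, get_default_qconfig

# Assumed setup, not the exact test code: one qconfig shared by the
# standalone nn.Linear / nn.LeakyReLU modules and, implicitly, the
# fused Linear-LeakyReLU pattern produced by fusion.
qconfig = get_default_qconfig("onednn")
qconfig_mapping = QConfigMapping() \
    .set_global(qconfig) \
    .set_object_type(torch.nn.Linear, qconfig) \
    .set_object_type(torch.nn.LeakyReLU, qconfig)
```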
OK, sounds good. We should be able to configure a qconfig for patterns, I think, but that will come up a bit later. cc @andrewor14 as FYI.
Can you add a TODO comment here, so that we remember to remove this later once we support fusion patterns in QConfigMapping?
OK, I have added a TODO comment in this PR: #89188
```python
    .set_backend_pattern_configs(_get_binary_op_configs(binary_op_dtype_configs)) \
    .set_backend_pattern_config(_get_cat_config(default_op_dtype_configs)) \
    .set_backend_pattern_configs(_get_default_op_configs(default_op_dtype_configs)) \
    .set_backend_pattern_configs(_get_fixed_qparams_op_configs(fixed_qparams_op_dtype_configs)) \
    .set_backend_pattern_configs(_get_share_qparams_op_configs(share_qparams_op_dtype_configs)) \
    .set_backend_pattern_configs(_get_bn_configs(default_op_dtype_configs)) \
    .set_backend_pattern_configs(_get_rnn_op_configs(rnn_op_dtype_configs)) \
    .set_backend_pattern_configs(_get_embedding_op_configs(embedding_op_dtype_configs))
```
Are you sure onednn actually supports all the ops here?
Well, I thought we needed all these configs set, so I just copied them here. I will try to remove them.
They are removed.
Hi @jerryzh168. Do you have more comments? Thanks!
Hi @jerryzh168. Is it OK to land this? Thanks!
```python
return BackendConfig("onednn") \
    .set_backend_pattern_configs(conv_configs) \
    .set_backend_pattern_configs(linear_configs)
```
Does the onednn backend only support these?
@jerryzh168 For the other ops, do I need to copy the default configs here, or just ignore them? What is expected? Here is your previous comment: #88665 (comment)
Looks like they are needed. I have added them back. Please take a look again. Thanks.
OK, looks good. So for the other ops, do we just fall back to the default (fbgemm) implementation, e.g. quantized::layer_norm?
Yes, we use the default native implementations for ops other than conv/linear. However, we still need to set the pattern configs for those other ops here; otherwise they are not quantized when we use onednn's backend config for prepare_fx and convert_fx.
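As a usage sketch (with a hypothetical toy model; assuming `get_onednn_backend_config` is importable from `torch.ao.quantization.backend_config` as exposed by this PR stack), the same backend config is passed to both `prepare_fx` and `convert_fx`:

```python
import torch
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx
from torch.ao.quantization.backend_config import get_onednn_backend_config

# Hypothetical toy model with a Linear -> LeakyReLU pattern to fuse.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 8)
        self.leaky_relu = torch.nn.LeakyReLU(0.1)

    def forward(self, x):
        return self.leaky_relu(self.linear(x))

model = M().eval()
example_inputs = (torch.randn(1, 8),)
backend_config = get_onednn_backend_config()

# Pass the same backend config to both steps; conv/linear use the
# onednn-specific patterns, while the remaining ops fall back to the
# default native kernels.
prepared = prepare_fx(model, get_default_qconfig_mapping("onednn"),
                      example_inputs, backend_config=backend_config)
prepared(*example_inputs)  # calibration
quantized = convert_fx(prepared, backend_config=backend_config)
```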
```python
# 1.1 linear module + leaky_relu fusion config
# linear leaky_relu, linear module + leaky_relu module
linear_configs.append(
    BackendPatternConfig((nn.LeakyReLU, nn.Linear))
```
Nit: we updated the pattern format recently; please take a look at #90698.
Thanks. It's fixed.
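For reference, a sketch of the updated config (assuming the forward-order pattern convention from #90698, and assuming `nni.LinearLeakyReLU` is the fused module introduced earlier in this stack):

```python
import torch.nn as nn
import torch.ao.nn.intrinsic as nni
from torch.ao.quantization.backend_config import BackendPatternConfig

# Updated format per #90698: the pattern tuple is written in forward
# execution order, (nn.Linear, nn.LeakyReLU), instead of the old
# reversed tuple (nn.LeakyReLU, nn.Linear).
linear_leaky_relu_config = BackendPatternConfig((nn.Linear, nn.LeakyReLU)) \
    .set_fused_module(nni.LinearLeakyReLU)
```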
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):

**Summary**
Add a backend config for the onednn backend so that it can support more post-op fusion for int8 inference. First, `Linear - LeakyReLU` fusion is implemented based on previous PRs.

**Test plan**
python test_quantization.py TestFuseFx

cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @leslie-fang-intel @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10