fix mkldnn quantization issue for weight reorder error #86876

XiaobingSuper · 2022-10-13T03:17:11Z

Stack from ghstack (oldest at bottom):

-> fix mkldnn quantization issue for weight reorder error #86876

Differential Revision: D40351062

On the mkldnn (oneDNN) quantization path, weights are prepacked using dummy input data to query the expected weight format. Because the expected weight format depends on the input's shape, the packed weight's format may differ from the format expected for the real input. When the two differ, a block-weight-to-block-weight reorder is performed. mkldnn may hit the following error during such a reorder (tested on an ICX machine):

test_conv_reorder_issue_onednn
    torch.ops.quantized.conv2d(qx, w_packed, output_scale=1.0, output_zero_point=0)
  File "/home/weiwen/.conda/envs/int8-dev/lib/python3.9/site-packages/torch/_ops.py", line 472, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: could not create a primitive descriptor for a reorder primitive

This PR fixes it: if the direct block-weight-to-block-weight reorder fails, the block weight is first reordered to a plain weight, and the plain weight is then reordered to the target block weight.
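
The fallback described above can be sketched in a few lines. This is a toy model in plain Python, not the actual change (which lives in PyTorch's C++ oneDNN quantized convolution code). The layout names `blockA` and `blockB`, the shape rule in `expected_weight_format`, and all class and function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


class ReorderError(RuntimeError):
    """Stands in for the backend failing to create a reorder primitive."""


@dataclass(frozen=True)
class Weight:
    values: tuple  # logical weight values (layout-independent in this toy model)
    fmt: str       # layout tag: "plain", "blockA" or "blockB" (hypothetical names)


# Assumption: the toy backend cannot reorder directly between these block layouts.
UNSUPPORTED_DIRECT = {("blockA", "blockB"), ("blockB", "blockA")}


def expected_weight_format(input_hw):
    """Hypothetical rule: the preferred block layout depends on the input size."""
    h, w = input_hw
    return "blockA" if h * w >= 64 else "blockB"


def reorder(weight, dst_fmt):
    """Direct reorder; raises when the (src, dst) pair is unsupported."""
    if (weight.fmt, dst_fmt) in UNSUPPORTED_DIRECT:
        raise ReorderError(
            "could not create a primitive descriptor for a reorder primitive"
        )
    return Weight(weight.values, dst_fmt)


def prepack(values, dummy_input_hw):
    """Pack the weight in the layout expected for a dummy input shape."""
    plain = Weight(tuple(values), "plain")
    return reorder(plain, expected_weight_format(dummy_input_hw))


def reorder_with_fallback(weight, dst_fmt):
    """Try block -> block directly; on failure go block -> plain -> block."""
    if weight.fmt == dst_fmt:
        return weight
    try:
        return reorder(weight, dst_fmt)
    except ReorderError:
        plain = reorder(weight, "plain")
        return reorder(plain, dst_fmt)


# Prepack with a dummy 28x28 input, then run with a real 5x4 input.
packed = prepack([1, 2, 3, 4], dummy_input_hw=(28, 28))
target_fmt = expected_weight_format((5, 4))
converted = reorder_with_fallback(packed, target_fmt)
```

In this model the direct `blockA` to `blockB` reorder raises, so `converted` is produced by going through the plain layout. The weight values are unchanged and only the layout tag differs.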

cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel @VitalyFedyunin @mingfeima @sanchitintel @ashokei @jingxu10

[ghstack-poisoned]

pytorch-bot · 2022-10-13T03:17:13Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86876

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5929579:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ghstack-source-id: cd377dd Pull Request resolved: #86876

jerryzh168 · 2022-10-13T17:15:23Z

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

jerryzh168 · 2022-11-10T01:18:36Z

@XiaobingSuper I have confirmed with my colleague that we can land this first, could you publish this draft PR?

jerryzh168 · 2022-11-10T01:19:37Z

could you 1. clean up the fix 2. split the changes to default to a separate PR?

Differential Revision: [D40351062](https://our.internmc.facebook.com/intern/diff/D40351062) [ghstack-poisoned]

XiaobingSuper · 2022-11-10T07:15:55Z

> could you 1. clean up the fix 2. split the changes to default to a separate PR?

Done, please help review it.

jgong5

The change LGTM, but could you please add a description of the suspected issue and how the change addresses it?

XiaobingSuper · 2022-11-11T03:24:35Z

@pytorchbot merge

pytorchmergebot · 2022-11-11T03:26:09Z

Merge failed

Reason: This PR has internal changes and must be landed via Phabricator

Details for Dev Infra team

Raised by workflow job

jerryzh168 · 2022-11-11T05:27:22Z

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

XiaobingSuper · 2022-11-15T01:11:53Z

@jerryzh168, could you help check the failing internal test case? I can't access it.

Differential Revision: [D40351062](https://our.internmc.facebook.com/intern/diff/D40351062) There is a potential issue when reordering block to block. Now, if that reorder fails, we first reorder to plain and then reorder to the target block. cc jerryzh168 jianyuh raghuramank100 jamesr66a vkuzo jgong5 Xia-Weiwen leslie-fang-intel VitalyFedyunin mingfeima sanchitintel ashokei jingxu10 [ghstack-poisoned]

XiaobingSuper · 2022-11-21T02:43:51Z

reproduce

A test case is added. @jerryzh168

vkuzo · 2022-11-21T15:41:54Z

Could we add some more context on what the issue was, which hardware the issue affects, and how this PR is fixing it? Also, to clarify, the test fails before this PR and passes after?

XiaobingSuper · 2022-11-22T01:23:49Z

> Could we add some more context on what the issue was, which hardware the issue affects, and how this PR is fixing it? Also, to clarify, the test fails before this PR and passes after?

@vkuzo, I added more context to describe the issue; please take a look.

XiaobingSuper · 2022-11-22T12:57:40Z

@jerryzh168, could we land it?

jerryzh168 · 2022-11-23T21:04:12Z

sure, will try landing this again

jerryzh168 · 2022-11-23T21:17:44Z

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

jerryzh168 · 2022-11-29T00:42:17Z

@XiaobingSuper could you rebase the PR on master?

XiaobingSuper · 2022-11-29T01:24:51Z

> @XiaobingSuper could you rebase the PR on master?

rebased.

jerryzh168 · 2022-11-29T04:39:26Z

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

XiaobingSuper · 2022-11-30T00:57:06Z

@jerryzh168 , please help land it if it is ok for you.

jerryzh168 · 2022-11-30T01:02:47Z

@pytorchbot rebase

pytorchmergebot · 2022-11-30T01:04:55Z

@pytorchbot successfully started a rebase job. Check the current status here

pytorchmergebot · 2022-11-30T01:05:14Z

Successfully rebased gh/XiaobingSuper/17/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/86876)

ghstack-source-id: 0105c99 Pull Request resolved: #86876

jerryzh168 · 2022-11-30T01:18:49Z

> @jerryzh168 , please help land it if it is ok for you.

trying to land, just need to make sure CI is green

jerryzh168 · 2022-11-30T03:27:11Z

@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-12-01T01:58:36Z

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

pytorchmergebot · 2022-12-01T02:00:21Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Differential Revision: [D40351062](https://our.internmc.facebook.com/intern/diff/D40351062)

On the mkldnn (oneDNN) quantization path, weights are prepacked using dummy input data to query the expected weight format. Because the expected weight format depends on the input's shape, the packed weight's format may differ from the format expected for the real input. When the two differ, a block-weight-to-block-weight reorder is performed. mkldnn may hit the following error during such a reorder (tested on an ICX machine):

```
test_conv_reorder_issue_onednn
    torch.ops.quantized.conv2d(qx, w_packed, output_scale=1.0, output_zero_point=0)
  File "/home/weiwen/.conda/envs/int8-dev/lib/python3.9/site-packages/torch/_ops.py", line 472, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: could not create a primitive descriptor for a reorder primitive
```

This PR fixes it: if the direct block-weight-to-block-weight reorder fails, the block weight is first reordered to a plain weight, and the plain weight is then reordered to the target block weight.

Pull Request resolved: pytorch#86876
Approved by: https://github.com/jgong5, https://github.com/jerryzh168

fix mkldnn quantization issue for weight reorder error

08c0d4b

[ghstack-poisoned]

XiaobingSuper requested review from digantdesai, jerryzh168, jianyuh, kimishpatel, salilsdesai and z-a-f as code owners October 13, 2022 03:17

pytorch-bot bot added the release notes: quantization release notes category label Oct 13, 2022

XiaobingSuper added a commit that referenced this pull request Oct 13, 2022

fix mkldnn quantization issue for weight reorder error

67d6941

ghstack-source-id: cd377dd Pull Request resolved: #86876

XiaobingSuper marked this pull request as draft October 13, 2022 03:21

pytorchbot added the open source label Oct 13, 2022

jerryzh168 mentioned this pull request Oct 14, 2022

update quantization tutorial by introudcing x86 backend pytorch/tutorials#2081

Merged

jgong5 mentioned this pull request Oct 14, 2022

update quantization doc: add x86 backend as default backend of server inference #86900

Closed

Update on "fix mkldnn quantization issue for weight reorder error"

2c972ab

Differential Revision: [D40351062](https://our.internmc.facebook.com/intern/diff/D40351062) [ghstack-poisoned]

XiaobingSuper mentioned this pull request Nov 10, 2022

quantization: make x86 as default backend (part 1) #88799

Closed

XiaobingSuper marked this pull request as ready for review November 10, 2022 05:45

github-actions bot added module: cpu CPU specific problem (e.g., perf, algorithm) oncall: quantization Quantization support in PyTorch labels Nov 10, 2022

jgong5 approved these changes Nov 11, 2022

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 11, 2022

XiaobingSuper added the intel priority matters to intel architecture from performance wise label Nov 15, 2022

jerryzh168 approved these changes Nov 23, 2022

pytorchmergebot pushed a commit that referenced this pull request Nov 30, 2022

fix mkldnn quantization issue for weight reorder error

886cb20

ghstack-source-id: 0105c99 Pull Request resolved: #86876

XiaobingSuper requested a review from jerryzh168 December 1, 2022 01:13

pytorchmergebot added the Merged label Dec 1, 2022

pytorchmergebot closed this in 0e7918b Dec 1, 2022

fzhao3 mentioned this pull request Jan 13, 2023

[RFC] Unified quantization backend for x86 CPU platforms #83888

Closed

facebook-github-bot deleted the gh/XiaobingSuper/17/head branch June 8, 2023 15:00
