[quant][fx] Fix dynamic weighted op lowering when input is used multiple times #74364
Conversation
…ple times
Summary:
If an input is used multiple times by modules that are dynamically quantized:
```
x -- linear1
\-- linear2
```
we insert quantize_per_tensor_dynamic and dequantize for the input, and a duplication pass then duplicates the dequantize ops for pattern matching:
```
x - quantize_per_tensor_dynamic - dequantize1 - linear1
\----- dequantize2 - linear2
```
However, the lowering code also has a check that skips the pattern when quantize_per_tensor_dynamic is used by multiple nodes, so the pattern is not recognized. In this case we need to duplicate quantize_per_tensor_dynamic as well, to recover both patterns:
```
x - quantize_per_tensor_dynamic1 -- dequantize1 -- linear1
\- quantize_per_tensor_dynamic2 -- dequantize2 -- linear2
```
so that they can be fused into dynamic linear:
```
x - linear_dynamic1
\-- linear_dynamic2
```
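To make the above concrete, here is a minimal sketch of how a model with this shape could go through the FX dynamic quantization workflow. The module, layer sizes, and the qconfig-dict form of `prepare_fx` are assumptions for illustration (newer releases take a `QConfigMapping` and `example_inputs`); this is not the PR's actual test code.

```python
import torch
from torch.ao.quantization import default_dynamic_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

class SharedInput(torch.nn.Module):
    """Hypothetical repro: the same input feeds two linear layers."""
    def __init__(self):
        super().__init__()
        self.linear1 = torch.nn.Linear(8, 8)
        self.linear2 = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.linear1(x), self.linear2(x)

model = SharedInput().eval()
# dynamic quantization: activations are quantized on the fly, weights ahead of time
qconfig_dict = {"": default_dynamic_qconfig}
prepared = prepare_fx(model, qconfig_dict)
quantized = convert_fx(prepared)

# with the fix, printing the graph should show both linears lowered to
# dynamic quantized linears, instead of one of them being left behind
# with explicit quantize_per_tensor_dynamic/dequantize nodes
print(quantized.graph)
```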
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_dynamic_linear_input_multiple_use
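The actual test lives in `TestQuantizeFx`; as a rough, hypothetical sketch of the kind of assertion it likely makes (continuing from the `quantized` module produced in the sketch above, and assuming module linears are swapped to `torch.nn.quantized.dynamic.Linear`):

```python
import torch

# both uses of the shared input should end up in their own dynamically
# quantized linear; before the fix, one of the two patterns was skipped
dynamic_linears = [
    m for m in quantized.modules()
    if isinstance(m, torch.nn.quantized.dynamic.Linear)
]
assert len(dynamic_linears) == 2, dynamic_linears
```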
@jerryzh168 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Review comment on this hunk of the diff:

```
# run the weight observer
weight_observer_module()

# this method is temporary will be removed soon
```
Is the real fix removing the check for whether dequantize nodes have multiple users in the lowering code? Is there a reason why we don't just directly change that instead of adding a temporary fix?
Yes, the real fix is to stop duplicating dequantize and remove the check; planning to talk to you in today's sync.
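For context on what "duplicating" a node means here, below is a minimal, generic torch.fx sketch of giving each consumer of a node its own copy. It illustrates the technique only; it is not the actual pass added in this PR, which handles the quantize_per_tensor_dynamic pattern inside the lowering code.

```python
import torch
from torch.fx import GraphModule

def duplicate_node_per_user(gm: GraphModule, target) -> GraphModule:
    """Give every user of a call_function node targeting `target`
    its own copy of that node (illustrative sketch only)."""
    graph = gm.graph
    for node in list(graph.nodes):
        if node.op != "call_function" or node.target is not target:
            continue
        users = list(node.users)
        if len(users) <= 1:
            continue
        # keep the original node for the first user, copy it for the rest
        for user in users[1:]:
            with graph.inserting_after(node):
                new_node = graph.node_copy(node, lambda n: n)
            user.replace_input_with(node, new_node)
    graph.lint()
    gm.recompile()
    return gm

# hypothetical usage on a traced/quantized GraphModule `gm`:
# duplicate_node_per_user(gm, torch.quantize_per_tensor_dynamic)
```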
…ple times (#74364): landed commit, with the same summary as the description above. Pull Request resolved: #74364. Imported from OSS. Reviewed By: yixin94. Differential Revision: D34952755. fbshipit-source-id: a950159fd6a661e84faf0baf1692f6783904cfb3 (cherry picked from commit 8a68968)