[Quant][core][gpu][improvement] Refactored implementation for conv2d_cudnn to use packed parameters #73510
Conversation
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…for conv2d_cudnn to use packed parameters (Reland of PR #73510) Summary: This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Pull Request resolved: #74212
…for conv2d_cudnn to use packed parameters (Reland of PR #73510) Summary: This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Pull Request resolved: #74220
…cudnn to use packed parameters (#73510) Summary: Pull Request resolved: #73510 (see the PR description below for the full summary). Test Plan: In the pytorch main directory, execute ``` python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn ``` for accuracy testing and ``` python test/test_quantization.py TestQuantizedConv.test_benchmark ``` for benchmark testing. Reviewed By: jerryzh168 Pulled By: dzdang (cherry picked from commit 03d9e68) ghstack-source-id: a6b7c04
…for conv2d_cudnn to use packed parameters (Reland of PR #73510) Summary: This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. We also renamed cudnnpack_utils.h to utils.h. Differential Revision: [D34886988](https://our.internmc.facebook.com/intern/diff/D34886988)
…cudnn to use packed parameters (Reland of PR #73510) (#74220) Summary: Pull Request resolved: #74220 This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Test Plan: Imported from OSS Reviewed By: jerryzh168 Differential Revision: D34886988 Pulled By: dzdang fbshipit-source-id: 9bdedcd6686956070cd1da8e76b69c0c76f34403 (cherry picked from commit 2a047a6)
…cudnn to use packed parameters (Reland of PR #73510) (#74318) Summary: Pull Request resolved: #74318 This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Differential Revision: D34934630 Test Plan: Imported from OSS Reviewed By: jerryzh168 Pulled By: dzdang fbshipit-source-id: 04bdc7c4123dc60e82504b032dbde484427dda93 (cherry picked from commit 7bf6729)
Stack from ghstack (oldest at bottom):
Summary:
The previous implementation introduced in #70622
and expanded on in #72770,
#73035, #73337
did not make use of packed parameters. This PR refactors the existing
implementation to use packed parameters for cudnn conv2d in the same manner
as was done for qnnpack and fbgemm in the following files:
aten/src/ATen/native/quantized/cpu/fbgemm_utils.h
aten/src/ATen/native/quantized/cpu/qnnpack_utils.h
aten/src/ATen/native/quantized/cpu/qconv_prepack.cpp
aten/src/ATen/native/quantized/cpu/qconv_unpack.cpp (note: this file will be
refactored into two files, one located in /quantized/ and the other in
/quantized/cpu/, in a subsequent PR, as we are currently using the dispatch
introduced in this file for the cudnn operator as well)
This allows the cudnn operators to be registered as quantized::conv2d,
quantized::conv2d_relu, and quantized::conv2d_prepack, and lets the dispatcher
determine which backend to use (e.g., cuda/cudnn, fbgemm, or qnnpack).
Test cases were also modified to follow the methodology of prepacking
the weight & bias before passing them into the conv2d operator.
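As an illustration, here is a minimal sketch of this prepack-then-convolve flow on the CPU backends (hypothetical shapes and quantization parameters; this is not the PR's test code, and the cudnn path in this stack instead uses qint8 tensors on a CUDA device):
```python
# Minimal sketch (not the PR's test code): prepack the quantized weight & bias
# once, then call the unified quantized::conv2d op; the dispatcher picks the
# backend from the tensors and the configured engine.
import torch

torch.backends.quantized.engine = "fbgemm"  # or "qnnpack"; assumed available

# Quantized input activation (quint8 on the CPU backends).
x = torch.randn(1, 3, 8, 8)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)

# Quantized weight and float bias.
w = torch.randn(4, 3, 3, 3)
qw = torch.quantize_per_tensor(w, scale=0.05, zero_point=0, dtype=torch.qint8)
bias = torch.randn(4)

stride, padding, dilation, groups = [1, 1], [0, 0], [1, 1], 1

# Prepack weight & bias ahead of time (quantized::conv2d_prepack).
packed = torch.ops.quantized.conv2d_prepack(qw, bias, stride, padding, dilation, groups)

# Convolution through the packed-parameter overloads of quantized::conv2d.
out = torch.ops.quantized.conv2d(qx, packed, 0.2, 0)            # output_scale, output_zero_point
out_relu = torch.ops.quantized.conv2d_relu(qx, packed, 0.2, 0)  # fused ReLU variant

# The packed parameters can be unpacked back into (weight, bias) if needed.
w_unpacked, b_unpacked = torch.ops.quantized.conv2d_unpack(packed)
```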
We also ensured that the refactoring did not reduce speed by verifying that
the computation times in the benchmark test case (see test plan below) are
consistent with the pre-refactoring results.
Note the following renames:
apply_impl is what was formerly raw_cudnn_convolution_forward
apply_impl_helper is what was formerly raw_cudnn_convolution_forward_out
Test plan:
In the pytorch main directory, execute
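```
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn
```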
for accuracy testing and
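```
python test/test_quantization.py TestQuantizedConv.test_benchmark
```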
for benchmark testing.
Differential Revision: D34803275