Conversation

dzdang (Contributor) commented Mar 7, 2022

Stack from ghstack (oldest at bottom):

Summary:
This PR changes the implementation of the conv2d cuDNN operator to use in-place ops,
which improves the quantized conv operator's efficiency when a bias and/or relu is used.
Based on discussions, to support in-place operations, unique uids need to be assigned to
the input and the output even when they are stored at the same memory address; e.g., see
the different uids assigned to conv_output.data_ptr in the current implementation.
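
In eager-mode terms, the gain comes from the bias add and the relu reusing the convolution's output buffer instead of materializing new tensors. Below is a minimal float sketch of that pattern (illustrative only: the actual change lives in the quantized cuDNN C++ path, and the helper name and shapes here are invented):

```
import torch
import torch.nn.functional as F

def conv_bias_relu_inplace(x, w, b):
    out = F.conv2d(x, w)
    biased = out.add_(b.reshape(1, -1, 1, 1))  # in-place bias add
    activated = biased.relu_()                 # in-place relu
    # Three logical tensors, one allocation: this is why the cuDNN graph
    # needs distinct uids even though the data_ptr is identical.
    assert out.data_ptr() == biased.data_ptr() == activated.data_ptr()
    return activated

y = conv_bias_relu_inplace(torch.randn(1, 8, 16, 16),
                           torch.randn(8, 8, 3, 3),
                           torch.randn(8))
```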

Test plan:
In the PyTorch main directory, execute
```
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn
```
for accuracy testing and
```
python test/test_quantization.py TestQuantizedConv.test_benchmark
```
for benchmark testing.
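
For a rough standalone comparison of the two patterns outside the test suite, torch.utils.benchmark can be used along these lines (a sketch with assumed shapes, not the PR's actual benchmark):

```
import torch
import torch.nn.functional as F
from torch.utils import benchmark

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(16, 64, 56, 56, device=device)
w = torch.randn(64, 64, 3, 3, device=device)
b = torch.randn(64, device=device)

def out_of_place(x, w, b):
    # conv, bias add, and relu each materialize a fresh tensor
    return F.relu(F.conv2d(x, w) + b.reshape(1, -1, 1, 1))

def in_place(x, w, b):
    # bias add and relu reuse the conv output buffer
    out = F.conv2d(x, w)
    out.add_(b.reshape(1, -1, 1, 1))
    return out.relu_()

for name, fn in [("out-of-place", out_of_place), ("in-place", in_place)]:
    timer = benchmark.Timer(stmt="fn(x, w, b)",
                            globals={"fn": fn, "x": x, "w": w, "b": b})
    print(name, timer.timeit(100))
```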

Differential Revision: [D34824250](https://our.internmc.facebook.com/intern/diff/D34824250)

pytorch-bot bot commented Mar 7, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/141ee5ebbb9b0d4b94056a5fdd1a765499c90619/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Triggered Workflows

| Workflow | Labels (bold = enabled) | Status |
| --- | --- | --- |
| linux-binary-conda | ciflow/binaries, ciflow/binaries_conda, **ciflow/default** | ✅ triggered |
| linux-binary-libtorch-cxx11-abi | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| linux-binary-libtorch-pre-cxx11 | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| linux-binary-manywheel | ciflow/all, ciflow/binaries, ciflow/binaries_wheel, **ciflow/default**, ciflow/trunk | ✅ triggered |
| linux-bionic-py3.7-clang9 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/noarch, ciflow/trunk | ✅ triggered |
| linux-bionic-rocm4.5-py3.7 | ciflow/all, **ciflow/default**, ciflow/linux, ciflow/rocm, ciflow/trunk | ✅ triggered |
| linux-docs | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/docs, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-vulkan-bionic-py3.7-clang9 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk, ciflow/vulkan | ✅ triggered |
| linux-xenial-cuda11.3-py3.7-gcc7 | ciflow/all, ciflow/cuda, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-cuda11.3-py3.7-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3-clang5-mobile-build | ciflow/all, **ciflow/default**, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3-clang5-mobile-custom-build-static | ciflow/all, **ciflow/default**, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-clang7-asan | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/sanitizers, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-clang7-onnx | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/onnx, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc7 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc7-no-ops | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| macos-arm64-binary-conda | ciflow/binaries, ciflow/binaries_conda, **ciflow/default** | ✅ triggered |
| macos-arm64-binary-wheel | ciflow/binaries, ciflow/binaries_wheel, **ciflow/default** | ✅ triggered |
| macos-binary-conda | ciflow/binaries, ciflow/binaries_conda, **ciflow/default** | ✅ triggered |
| macos-binary-libtorch-cxx11-abi | ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default** | ✅ triggered |
| macos-binary-libtorch-pre-cxx11 | ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default** | ✅ triggered |
| macos-binary-wheel | ciflow/binaries, ciflow/binaries_wheel, **ciflow/default** | ✅ triggered |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single | ciflow/all, ciflow/android, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit | ciflow/all, ciflow/android, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/trunk, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, **ciflow/default**, ciflow/trunk, ciflow/win | ✅ triggered |
| windows-binary-libtorch-debug | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| windows-binary-libtorch-release | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| windows-binary-wheel | ciflow/all, ciflow/binaries, ciflow/binaries_wheel, **ciflow/default**, ciflow/trunk | ✅ triggered |

Skipped Workflows

| Workflow | Labels | Status |
| --- | --- | --- |
| caffe2-linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| docker-builds | ciflow/all, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64 | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-arm64-coreml | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-arm64-custom-ops | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-arm64-metal | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-x86-64 | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-x86-64-coreml | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| libtorch-linux-xenial-cuda10.2-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk | 🚫 skipped |
| linux-docs-push | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| linux-xenial-cuda11.3-py3.7-gcc7-no-ops | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk | 🚫 skipped |
| macos-10-15-py3-arm64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| macos-10-15-py3-lite-interpreter-x86-64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| macos-11-py3-x86-64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| parallelnative-linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-bionic-cuda11.5-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck | 🚫 skipped |
| periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.5-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build | ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| pytorch-xla-linux-bionic-py3.7-clang8 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla | 🚫 skipped |

facebook-github-bot (Contributor) commented Mar 7, 2022

💊 CI failures summary and remediations

As of commit 9dc0452 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

| Job | Step | Action |
| --- | --- | --- |
| GitHub Actions ios-12-5-1-arm64-custom-ops / build | Unknown | 🔁 rerun |

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


dzdang added a commit that referenced this pull request Mar 7, 2022

… inplace operations

ghstack-source-id: 579184b
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 7, 2022

… inplace operations

ghstack-source-id: 1168e3b
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 7, 2022

… inplace operations

ghstack-source-id: 5021156
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 8, 2022

… inplace operations

ghstack-source-id: e8e1fd2
Pull Request resolved: #73857

dzdang requested a review from jerryzh168 March 8, 2022 14:21
dzdang added 3 commits March 10, 2022 13:13
jerryzh168 (Contributor) left a comment

Great! The merge conflict should be resolved in the previous PR, I think.

You can use `git rebase -i origin/master` to edit the commit history; mark the commit of the PR you want to edit (the previous PR) with `e` or `edit`.

dzdang added 2 commits March 11, 2022 10:08
dzdang (Contributor, Author) commented Mar 11, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

dzdang added a commit that referenced this pull request Mar 17, 2022

… inplace operations

ghstack-source-id: e85ca45
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 17, 2022

… inplace operations

ghstack-source-id: d381df7
Pull Request resolved: #73857

dzdang (Contributor, Author) commented Mar 17, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Mar 18, 2022

… inplace operations (#73857)

Pull Request resolved: #73857

Reviewed By: ezyang

Differential Revision: D34824250

Pulled By: dzdang

fbshipit-source-id: 4d0d2fd61245d4a2cbbdffb910eb73a5807237fd

github-actions (Contributor) commented:

Hey @dzdang.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot deleted the gh/dzdang/48/head branch March 21, 2022 14:17
shahofblah pushed a commit that referenced this pull request Mar 25, 2022

… inplace operations (#73857)

Pull Request resolved: #73857

Reviewed By: ezyang

Differential Revision: D34824250

Pulled By: dzdang

fbshipit-source-id: 4d0d2fd61245d4a2cbbdffb910eb73a5807237fd
(cherry picked from commit fe21915)