Conversation

dzdang (Contributor) commented Mar 7, 2022

Stack from ghstack (oldest at bottom):

Summary:
This PR changes the implementation of the conv2d cuDNN operator to use in-place ops,
which improves the quantized conv operator's efficiency when a bias and/or relu is used.
Based on discussions, to support in-place operations, unique uids need to be assigned to
the input and the output even when they are stored at the same memory address; e.g., see
the different uids assigned to conv_output.data_ptr in the current implementation.
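
In eager-mode terms, the gain comes from the bias add and the relu reusing the convolution's output buffer instead of materializing new tensors. Below is a minimal float sketch of that pattern (illustrative only: the actual change lives in the quantized cuDNN C++ path, and the helper name and shapes here are invented):

```
import torch
import torch.nn.functional as F

def conv_bias_relu_inplace(x, w, b):
    out = F.conv2d(x, w)
    biased = out.add_(b.reshape(1, -1, 1, 1))  # in-place bias add
    activated = biased.relu_()                 # in-place relu
    # Three logical tensors, one allocation: this is why the cuDNN graph
    # needs distinct uids even though the data_ptr is identical.
    assert out.data_ptr() == biased.data_ptr() == activated.data_ptr()
    return activated

y = conv_bias_relu_inplace(torch.randn(1, 8, 16, 16),
                           torch.randn(8, 8, 3, 3),
                           torch.randn(8))
```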

Test plan:
In the PyTorch main directory, execute
```
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn
```
for accuracy testing and
```
python test/test_quantization.py TestQuantizedConv.test_benchmark
```
for benchmark testing.
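
For a rough standalone comparison of the two patterns outside the test suite, torch.utils.benchmark can be used along these lines (a sketch with assumed shapes, not the PR's actual benchmark):

```
import torch
import torch.nn.functional as F
from torch.utils import benchmark

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(16, 64, 56, 56, device=device)
w = torch.randn(64, 64, 3, 3, device=device)
b = torch.randn(64, device=device)

def out_of_place(x, w, b):
    # conv, bias add, and relu each materialize a fresh tensor
    return F.relu(F.conv2d(x, w) + b.reshape(1, -1, 1, 1))

def in_place(x, w, b):
    # bias add and relu reuse the conv output buffer
    out = F.conv2d(x, w)
    out.add_(b.reshape(1, -1, 1, 1))
    return out.relu_()

for name, fn in [("out-of-place", out_of_place), ("in-place", in_place)]:
    timer = benchmark.Timer(stmt="fn(x, w, b)",
                            globals={"fn": fn, "x": x, "w": w, "b": b})
    print(name, timer.timeit(100))
```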

Differential Revision: [D34824250](https://our.internmc.facebook.com/intern/diff/D34824250)

pytorch-bot bot commented Mar 7, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/141ee5ebbb9b0d4b94056a5fdd1a765499c90619/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Triggered Workflows

| Workflow | Labels (bold = enabled) | Status |
| --- | --- | --- |
| linux-binary-conda | ciflow/binaries, ciflow/binaries_conda, **ciflow/default** | ✅ triggered |
| linux-binary-libtorch-cxx11-abi | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| linux-binary-libtorch-pre-cxx11 | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| linux-binary-manywheel | ciflow/all, ciflow/binaries, ciflow/binaries_wheel, **ciflow/default**, ciflow/trunk | ✅ triggered |
| linux-bionic-py3.7-clang9 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/noarch, ciflow/trunk | ✅ triggered |
| linux-bionic-rocm4.5-py3.7 | ciflow/all, **ciflow/default**, ciflow/linux, ciflow/rocm, ciflow/trunk | ✅ triggered |
| linux-docs | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/docs, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-vulkan-bionic-py3.7-clang9 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk, ciflow/vulkan | ✅ triggered |
| linux-xenial-cuda11.3-py3.7-gcc7 | ciflow/all, ciflow/cuda, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-cuda11.3-py3.7-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3-clang5-mobile-build | ciflow/all, **ciflow/default**, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3-clang5-mobile-custom-build-static | ciflow/all, **ciflow/default**, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-clang7-asan | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/sanitizers, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-clang7-onnx | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/onnx, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc7 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| linux-xenial-py3.7-gcc7-no-ops | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| macos-arm64-binary-conda | ciflow/binaries, ciflow/binaries_conda, **ciflow/default** | ✅ triggered |
| macos-arm64-binary-wheel | ciflow/binaries, ciflow/binaries_wheel, **ciflow/default** | ✅ triggered |
| macos-binary-conda | ciflow/binaries, ciflow/binaries_conda, **ciflow/default** | ✅ triggered |
| macos-binary-libtorch-cxx11-abi | ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default** | ✅ triggered |
| macos-binary-libtorch-pre-cxx11 | ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default** | ✅ triggered |
| macos-binary-wheel | ciflow/binaries, ciflow/binaries_wheel, **ciflow/default** | ✅ triggered |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single | ciflow/all, ciflow/android, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit | ciflow/all, ciflow/android, ciflow/cpu, **ciflow/default**, ciflow/linux, ciflow/trunk | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, **ciflow/default**, ciflow/trunk, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, **ciflow/default**, ciflow/trunk, ciflow/win | ✅ triggered |
| windows-binary-libtorch-debug | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| windows-binary-libtorch-release | ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, **ciflow/default**, ciflow/trunk | ✅ triggered |
| windows-binary-wheel | ciflow/all, ciflow/binaries, ciflow/binaries_wheel, **ciflow/default**, ciflow/trunk | ✅ triggered |

Skipped Workflows

| Workflow | Labels | Status |
| --- | --- | --- |
| caffe2-linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| docker-builds | ciflow/all, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-arm64 | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-arm64-coreml | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-arm64-custom-ops | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-arm64-metal | ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled | 🚫 skipped |
| ios-12-5-1-x86-64 | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| ios-12-5-1-x86-64-coreml | ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk | 🚫 skipped |
| libtorch-linux-xenial-cuda10.2-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk | 🚫 skipped |
| linux-docs-push | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| linux-xenial-cuda11.3-py3.7-gcc7-no-ops | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk | 🚫 skipped |
| macos-10-15-py3-arm64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| macos-10-15-py3-lite-interpreter-x86-64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| macos-11-py3-x86-64 | ciflow/all, ciflow/macos, ciflow/trunk | 🚫 skipped |
| parallelnative-linux-xenial-py3.7-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-bionic-cuda11.5-py3.7-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck | 🚫 skipped |
| periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.5-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |
| pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build | ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk | 🚫 skipped |
| pytorch-xla-linux-bionic-py3.7-clang8 | ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla | 🚫 skipped |

facebook-github-bot (Contributor) commented Mar 7, 2022

💊 CI failures summary and remediations

As of commit 9dc0452 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

| Job | Step | Action |
| --- | --- | --- |
| GitHub Actions ios-12-5-1-arm64-custom-ops / build | Unknown | 🔁 rerun |

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


dzdang added a commit that referenced this pull request Mar 7, 2022

… inplace operations

ghstack-source-id: 579184b
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 7, 2022

… inplace operations

ghstack-source-id: 1168e3b
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 7, 2022

… inplace operations

ghstack-source-id: 5021156
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 8, 2022

… inplace operations

ghstack-source-id: e8e1fd2
Pull Request resolved: #73857

dzdang requested a review from jerryzh168 March 8, 2022 14:21
dzdang added 3 commits March 10, 2022 13:13
jerryzh168 (Contributor) left a comment

Great! The merge conflict should be resolved in the previous PR, I think.

You can use `git rebase -i origin/master` to edit the commit history; mark the commit of the PR you want to edit (the previous PR) with `e` or `edit`.

dzdang added 2 commits March 11, 2022 10:08
dzdang (Contributor, Author) commented Mar 11, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

dzdang added a commit that referenced this pull request Mar 17, 2022

… inplace operations

ghstack-source-id: e85ca45
Pull Request resolved: #73857

dzdang added a commit that referenced this pull request Mar 17, 2022

… inplace operations

ghstack-source-id: d381df7
Pull Request resolved: #73857

dzdang (Contributor, Author) commented Mar 17, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Mar 18, 2022

… inplace operations (#73857)

Pull Request resolved: #73857

Reviewed By: ezyang

Differential Revision: D34824250

Pulled By: dzdang

fbshipit-source-id: 4d0d2fd61245d4a2cbbdffb910eb73a5807237fd

github-actions (Contributor) commented:

Hey @dzdang.
You've committed this PR, but it does not have both a 'release notes: ...' and 'topics: ...' label. Please add one of each to the PR. The 'release notes: ...' label should represent the part of PyTorch that this PR changes (fx, autograd, distributed, etc) and the 'topics: ...' label should represent the kind of PR it is (not user facing, new feature, bug fix, perf improvement, etc). The list of valid labels can be found here for the 'release notes: ...' and here for the 'topics: ...'.
For changes that are 'topic: not user facing' there is no need for a release notes label.

facebook-github-bot deleted the gh/dzdang/48/head branch March 21, 2022 14:17
shahofblah pushed a commit that referenced this pull request Mar 25, 2022

… inplace operations (#73857)

Pull Request resolved: #73857

Reviewed By: ezyang

Differential Revision: D34824250

Pulled By: dzdang

fbshipit-source-id: 4d0d2fd61245d4a2cbbdffb910eb73a5807237fd
(cherry picked from commit fe21915)