@dzdang
Contributor

@dzdang dzdang commented Mar 21, 2022

Stack from ghstack (oldest at bottom):

Summary:
This PR implements the quantized add operator using cuDNN operations and
adds a corresponding test function in test_quantized_op.py. Ideally, this
function would be merged with the CPU variant, but for now it is kept
separate until cuDNN v8 is in the default build. The merge is further
complicated by the fact that cuDNN quantized add currently supports only
int8 symmetrically quantized tensors.
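
For readers unfamiliar with the restriction mentioned above, here is a minimal pure-Python sketch of what an add (optionally fused with ReLU) over int8 symmetrically quantized values computes. It is a reference illustration only, not the cuDNN implementation in this PR. The helper names (`quantize`, `dequantize`, `qadd_reference`) are hypothetical, and the sketch assumes per-tensor scales with the zero point fixed at 0.

```python
from typing import List, Sequence

QMIN, QMAX = -128, 127  # representable int8 range


def quantize(x: float, scale: float) -> int:
    """Symmetric quantization: the zero point is assumed to be 0."""
    q = round(x / scale)  # Python rounds halves to the nearest even integer
    return max(QMIN, min(QMAX, q))  # saturate to the int8 range


def dequantize(q: int, scale: float) -> float:
    """Map an int8 value back to a real number."""
    return q * scale


def qadd_reference(
    qa: Sequence[int],
    scale_a: float,
    qb: Sequence[int],
    scale_b: float,
    out_scale: float,
    relu: bool = False,
) -> List[int]:
    """Elementwise reference: dequantize, add, optionally ReLU, requantize."""
    out = []
    for a, b in zip(qa, qb):
        real = dequantize(a, scale_a) + dequantize(b, scale_b)
        if relu:
            real = max(real, 0.0)
        out.append(quantize(real, out_scale))
    return out


if __name__ == "__main__":
    qa = [quantize(v, 0.5) for v in (1.0, -2.0, 3.0)]
    qb = [quantize(v, 0.25) for v in (0.5, 0.5, -4.0)]
    print(qadd_reference(qa, 0.5, qb, 0.25, out_scale=0.5))
    print(qadd_reference(qa, 0.5, qb, 0.25, out_scale=0.5, relu=True))
```

With a zero point of 0, the requantization step reduces to a scale and a clamp, which is part of why the symmetric case is simpler to support than the general affine case.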

Test Plan:
From the PyTorch root directory, run:

python test/test_quantization.py TestQuantizedOps.test_qadd_relu_cudnn

Differential Revision: D35218224

Summary:
This PR implements the quantized add operator using cudnn operations.
A test case was added ....

Test Plan:
TBA

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Mar 21, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/9f6106538642a9b5dfa9cbd3147a4e2962606975/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
deploy-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-manywheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
windows-binary-libtorch-debug ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-libtorch-release ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-wheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-bionic-rocm4.5-py3.7-distributed ciflow/all, ciflow/linux, ciflow/rocm, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot
Contributor

facebook-github-bot commented Mar 21, 2022

💊 CI failures summary and remediations

As of commit d378f4c (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@dzdang
Contributor Author

dzdang commented Mar 21, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

… cudnn"

Differential Revision: [D35009111](https://our.internmc.facebook.com/intern/diff/D35009111)

[ghstack-poisoned]
dzdang added a commit that referenced this pull request Mar 21, 2022
ghstack-source-id: 2b18896
Pull Request resolved: #74463
dzdang added a commit that referenced this pull request Mar 21, 2022
ghstack-source-id: cf3aabc
Pull Request resolved: #74463
@dzdang dzdang marked this pull request as draft March 22, 2022 00:40
… cudnn"

Summary:
This PR implements the quantized add operator using cudnn operations.
Added a corresponding test function in test_quantized_op.py. Ideally,
we should merge this function with the cpu variant, but for now, we will
keep it separate until cudnn v8 is in the default build.

Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestQuantizedOps.test_qadd_relu_cudnn
```

Differential Revision: [D35009111](https://our.internmc.facebook.com/intern/diff/D35009111)

[ghstack-poisoned]
dzdang added a commit that referenced this pull request Mar 22, 2022
ghstack-source-id: 908412b
Pull Request resolved: #74463
@dzdang dzdang changed the title from "[quant][gpu][core] Implemented quantized added operator in cudnn" to "[quant][gpu][core] Implemented quantized add operator in cudnn" Mar 22, 2022
dzdang added a commit that referenced this pull request Mar 22, 2022
ghstack-source-id: ef6fbf9
Pull Request resolved: #74463
@pytorch-bot

pytorch-bot bot commented Mar 22, 2022

We have recently simplified the CIFlow labels and ciflow/macos is no longer in use.
You can use any of the following:

  • ciflow/trunk (.github/workflows/trunk.yml): all jobs we run per-commit on master
  • ciflow/periodic (.github/workflows/periodic.yml): all jobs we run periodically on master
  • ciflow/all: trunk + periodic; all jobs we run in master CI
  • ciflow/nightly (.github/workflows/nightly.yml): all jobs we run nightly
  • ciflow/binaries: all binary build and upload jobs

dzdang added a commit that referenced this pull request Mar 25, 2022
ghstack-source-id: 4de9f48
Pull Request resolved: #74463
@dzdang
Contributor Author

dzdang commented Mar 25, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@dzdang
Contributor Author

dzdang commented Mar 28, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Mar 28, 2022
…4463)

Summary:
Pull Request resolved: #74463

This PR implements the quantized add operator using cudnn operations.
Also added a corresponding test function in test_quantized_op.py. Ideally,
we should merge this function with the cpu variant, but for now, we will
keep it separate until cudnn v8 is in the default build. Other factors also
complicate the merge as cudnn quantized add is currently only supported for
int8 symmetrically quantized tensors.

Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestQuantizedOps.test_qadd_relu_cudnn
```

TBA

Differential Revision: D35009111

Reviewed By: jerryzh168

Pulled By: dzdang

fbshipit-source-id: 13afa7f0192ffaf1f36334b1af827202c7dd0f74
std::vector<int64_t> new_sizes(3, 1);
// cudnn expects leading dimensions to be the dummy dimensions
new_sizes.back() = qa.sizes().back();
if (qa.ndim() == 2) {
Collaborator

This line should be qa.dim() instead, similar to previous line 67.
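
To make the shape handling in the snippet under review easier to follow, here is a hedged Python sketch of padding a tensor shape to three dimensions with leading dummy dimensions of size 1. The hunk above is cut off at the 2-D branch, so the right-alignment of the remaining sizes is an assumption, as is the limit of three dimensions. `pad_sizes_to_3d` is a hypothetical helper, not a function from this PR.

```python
from typing import List, Sequence


def pad_sizes_to_3d(sizes: Sequence[int]) -> List[int]:
    """Right-align `sizes` in a length-3 shape; leading slots stay at 1.

    Assumption: inputs have between 1 and 3 dimensions and the existing
    sizes keep their order, so only dummy dimensions are prepended.
    """
    ndim = len(sizes)  # plays the role of qa.dim() in the C++ snippet
    if not 1 <= ndim <= 3:
        raise ValueError(f"expected 1 to 3 dimensions, got {ndim}")
    new_sizes = [1, 1, 1]  # mirrors std::vector<int64_t> new_sizes(3, 1)
    new_sizes[3 - ndim:] = list(sizes)  # fill the trailing slots
    return new_sizes


if __name__ == "__main__":
    for shape in ([5], [4, 5], [2, 4, 5]):
        print(shape, "->", pad_sizes_to_3d(shape))
```

Because only size-1 dimensions are added, the element count is unchanged, so the padded shape describes the same data.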

@facebook-github-bot
Contributor

This pull request has been reverted by 7bb0133. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert, and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).

dzdang added a commit that referenced this pull request Mar 29, 2022
ghstack-source-id: eb4c059
Pull Request resolved: #74463
@dzdang dzdang changed the title from "[quant][gpu][core] Implemented quantized add operator using cudnn" to "[quant][gpu][core] Implemented quantized add operator using cudnn [reland PR74463]" Mar 29, 2022
dzdang added a commit that referenced this pull request Mar 29, 2022
…land PR74463]

ghstack-source-id: 4713a57
Pull Request resolved: #74463
dzdang added a commit that referenced this pull request Mar 29, 2022
…land PR74463]

ghstack-source-id: 4713a57
Pull Request resolved: #74463
@dzdang
Contributor Author

dzdang commented Mar 29, 2022

@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Mar 30, 2022
…land PR74463] (#74463)

Summary:
Pull Request resolved: #74463

This PR implements the quantized add operator using cudnn operations.
Also added a corresponding test function in test_quantized_op.py. Ideally,
we should merge this function with the cpu variant, but for now, we will
keep it separate until cudnn v8 is in the default build. Other factors also
complicate the merge as cudnn quantized add is currently only supported for
int8 symmetrically quantized tensors.

Test Plan:
In pytorch main dir, execute
```
python test/test_quantization.py TestQuantizedOps.test_qadd_relu_cudnn
```

Reviewed By: ngimel

Differential Revision: D35218224

Pulled By: dzdang

fbshipit-source-id: a2e57e0b46cff655f2fb77000ea4db3a558a0851
@facebook-github-bot facebook-github-bot deleted the gh/dzdang/57/head branch April 2, 2022 14:16

6 participants