[Quant][core][gpu][improvement] Refactored implementation for conv2d_cudnn to use packed parameters #73510
Conversation
@dzdang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…for conv2d_cudnn to use packed parameters (Reland of PR #73510) Summary: This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Pull Request resolved: #74212
…for conv2d_cudnn to use packed parameters (Reland of PR #73510) Summary: This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Pull Request resolved: #74220
…cudnn to use packed parameters (#73510) Summary: Pull Request resolved: #73510 (see the PR description below for the full summary). Test Plan: In the pytorch main directory, execute ``` python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn ``` for accuracy testing and ``` python test/test_quantization.py TestQuantizedConv.test_benchmark ``` for benchmark testing. Reviewed By: jerryzh168 Pulled By: dzdang (cherry picked from commit 03d9e68) ghstack-source-id: a6b7c04
…for conv2d_cudnn to use packed parameters (Reland of PR #73510) Summary: This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. We also renamed cudnnpack_utils.h to utils.h. Differential Revision: [D34886988](https://our.internmc.facebook.com/intern/diff/D34886988)
…cudnn to use packed parameters (Reland of PR #73510) (#74220) Summary: Pull Request resolved: #74220 This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Test Plan: Imported from OSS Reviewed By: jerryzh168 Differential Revision: D34886988 Pulled By: dzdang fbshipit-source-id: 9bdedcd6686956070cd1da8e76b69c0c76f34403 (cherry picked from commit 2a047a6)
…cudnn to use packed parameters (Reland of PR #73510) (#74318) Summary: Pull Request resolved: #74318 This is a reland of #73510 -- please see that PR for the full summary and test plan. The only change in this PR is that we add the ops to check_forward_backward_compatibility.py to work around the backwards-compatibility issue introduced in the previous PR, which changed the names of the cudnn conv operations. Differential Revision: D34934630 Test Plan: Imported from OSS Reviewed By: jerryzh168 Pulled By: dzdang fbshipit-source-id: 04bdc7c4123dc60e82504b032dbde484427dda93 (cherry picked from commit 7bf6729)
Stack from ghstack (oldest at bottom):
Summary:
The previous implementation introduced in #70622
and expanded on in #72770,
#73035, #73337
did not make use of packed parameters. This PR refactors the existing
implementation to use packed parameters for cudnn conv2d in the same manner
as was done for qnnpack and fbgemm in the following files:
aten/src/ATen/native/quantized/cpu/fbgemm_utils.h
aten/src/ATen/native/quantized/cpu/qnnpack_utils.h
aten/src/ATen/native/quantized/cpu/qconv_prepack.cpp
aten/src/ATen/native/quantized/cpu/qconv_unpack.cpp (note: this file will be
refactored into two files, one located in /quantized/ and the other in
/quantized/cpu/, in a subsequent PR, as we are currently using the dispatch
introduced in this file for the cudnn operator as well)
This allows the cudnn operators to be registered as quantized::conv2d,
quantized::conv2d_relu, and quantized::conv2d_prepack, and lets the dispatcher
determine which backend to use (e.g., cuda/cudnn, fbgemm, or qnnpack).
Test cases were also modified to follow the methodology of prepacking
the weight & bias before passing them into the conv2d operator.
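As an illustration, here is a minimal sketch of this prepack-then-convolve flow on the CPU backends (hypothetical shapes and quantization parameters; this is not the PR's test code, and the cudnn path in this stack instead uses qint8 tensors on a CUDA device):
```python
# Minimal sketch (not the PR's test code): prepack the quantized weight & bias
# once, then call the unified quantized::conv2d op; the dispatcher picks the
# backend from the tensors and the configured engine.
import torch

torch.backends.quantized.engine = "fbgemm"  # or "qnnpack"; assumed available

# Quantized input activation (quint8 on the CPU backends).
x = torch.randn(1, 3, 8, 8)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)

# Quantized weight and float bias.
w = torch.randn(4, 3, 3, 3)
qw = torch.quantize_per_tensor(w, scale=0.05, zero_point=0, dtype=torch.qint8)
bias = torch.randn(4)

stride, padding, dilation, groups = [1, 1], [0, 0], [1, 1], 1

# Prepack weight & bias ahead of time (quantized::conv2d_prepack).
packed = torch.ops.quantized.conv2d_prepack(qw, bias, stride, padding, dilation, groups)

# Convolution through the packed-parameter overloads of quantized::conv2d.
out = torch.ops.quantized.conv2d(qx, packed, 0.2, 0)            # output_scale, output_zero_point
out_relu = torch.ops.quantized.conv2d_relu(qx, packed, 0.2, 0)  # fused ReLU variant

# The packed parameters can be unpacked back into (weight, bias) if needed.
w_unpacked, b_unpacked = torch.ops.quantized.conv2d_unpack(packed)
```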
We also ensured that the refactoring did not reduce speed by verifying that
the computation times in the benchmark test case (see test plan below) are
consistent with the pre-refactoring results.
Note the following renames:
apply_impl is what was formerly raw_cudnn_convolution_forward
apply_impl_helper is what was formerly raw_cudnn_convolution_forward_out
Test plan:
In the pytorch main directory, execute
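```
python test/test_quantization.py TestQuantizedConv.test_qconv2d_cudnn
```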
for accuracy testing and
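```
python test/test_quantization.py TestQuantizedConv.test_benchmark
```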
for benchmark testing.
Differential Revision: D34803275