-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Refactor qconv to reduce allocations. #42007
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
💊 CI failures summary and remediationsAs of commit d1cd111 (more details on the Dr. CI page):
Extra GitHub checks: 1 failed
ci.pytorch.org: 1 failedThis comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group. This comment has been revised 77 times. |
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
| pytorch_qnnp_log_error( | ||
| "Per channel quantized weights are not supported for XZP kernels"); | ||
| assert("Failed to initialize QNNPACK conv_param_t struct."); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this cause unexpected behavior if someone uses the XZP kernels? I know we are not currently using it but it would be safer to either remove it completely or leave it functional rather than an in-between state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch. Strange I removed this. Will reinstate.
| @@ -1,4 +1,5 @@ | |||
| #include <conv_utils.h> | |||
| #include <conv_utils.h> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant include
| input_pixel_stride, | ||
| output, | ||
| output_pixel_stride, | ||
| threadpool); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the logic removed to the left entirely contained in pytorch_qnnp_setup_convolution2d_nhwc_q8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
| const uint32_t cr = pytorch_qnnp_params.q8dw9.cr; | ||
| const size_t group_stride = (groups + (cr - 1)) & -cr; | ||
|
|
||
| if (any_padding) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why this block is removed? From 376 to 414.
| const void** indirection_buffer = (const void**)realloc( | ||
| deconvolution->indirection_buffer, indirection_buffer_size); | ||
| if (indirection_buffer == NULL) { | ||
| pytorch_qnnp_status status = pytorch_qnnp_setup_deconvolution2d_nhwc_q8( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So again, all of the logic removed to the left is replaced with pytorch_qnnp_setup_deconvolution2d_nhwc_q8?
| const bool any_padding = (conv_p.padding[0]| conv_p.padding[1] | ||
| |conv_p.padding[2] | conv_p.padding[3]) != 0; | ||
|
|
||
| if (any_padding) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I see, it is moved here.
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
Summary: zero buffer and indirection pointers are allocatoed on every iterations. With this refactor we create op once for qnnpackconv struct and keep repopulating indirection pointer as necessary. For deconv moved much of op creation outside so that we can avoid creating and destroying ops every time. Test Plan: CI quantization tests. deconvolution-test Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D22726972](https://our.internmc.facebook.com/intern/diff/D22726972) [ghstack-poisoned]
|
This pull request has been merged in 04aa42a. |
Stack from ghstack:
Summary:
zero buffer and indirection pointers are allocatoed on every iterations.
With this refactor we create op once for qnnpackconv struct and keep
repopulating indirection pointer as necessary.
For deconv moved much of op creation outside so that we can avoid creating and
destroying ops every time.
Test Plan:
CI quantization tests.
deconvolution-test
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D22726972