[quant][pyper] make embedding_bag quantization static #44008
Summary:
embedding_bag requires only quantization of weights (no dynamic quantization of inputs), so the type of quantization is essentially static (without calibration).
This will enable pyper to do fc and embedding_bag quantization using the same API call.
Test Plan:
python test/test_quantization.py test_embedding_bag
ghstack-source-id: 8d24934
Pull Request resolved: #44008
💊 CI failures summary and remediations
As of commit 24789a1 (more details on the Dr. CI page):
🚧 1 fixed upstream failure: These were probably caused by upstream breakages that were already fixed.
Please rebase on the
Codecov Report
Coverage Diff

| | gh/supriyar/171/base | #44008 | +/- |
|---|---|---|---|
| Coverage | 69.35% | 69.26% | -0.09% |
| Files | 381 | 381 | |
| Lines | 47313 | 47239 | -74 |
| Hits | 32812 | 32722 | -90 |
| Misses | 14501 | 14517 | +16 |
int8_qconfig = QConfig(activation=PlaceholderObserver.with_args(dtype=torch.float,
                                                                custom_op_name="embedding_bag_byte"),
                       weight=PlaceholderObserver.with_args(custom_op_name="embedding_bag_byte"))
m = prepare_jit(m, {'embedding1' : int4_qconfig, 'embedding2' : int8_qconfig})
What about eager mode? Currently we expose embedding bag as a module in torch/nn/quantized/dynamic. Should we change that too? Strictly speaking, this case straddles the boundary between static and dynamic: output activations are in fp32 (like dynamic), inputs are addresses and require no quantization, and only the weights are quantized.
I'm not sure about eager mode. Currently, for static quant we expect users to provide a calibration fn when they call quantize. Weight-only quantization shouldn't require that step, so in that sense it fits better into dynamic.
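
To make the static-versus-dynamic point concrete, here is a minimal pure-Python sketch of weight-only quantization for an embedding bag. It is not the PyTorch or FBGEMM implementation: the helper names `quantize_rows` and `embedding_bag_sum`, the row-wise 8-bit scheme with a per-row scale and minimum, and the sum pooling mode are all assumptions chosen for illustration. What it shows is that the quantization parameters come from the weights alone (so no calibration data is needed), the inputs are plain integer indices, and the output stays in floating point.

```python
from typing import List, Tuple

# One quantized row: (8-bit codes, scale, row minimum). Hypothetical layout.
QRow = Tuple[List[int], float, float]


def quantize_rows(weights: List[List[float]]) -> List[QRow]:
    """Quantize each row to 8-bit codes using only that row's min and max."""
    q_rows = []
    for row in weights:
        lo, hi = min(row), max(row)
        # A constant row would give scale 0.0; fall back to 1.0 in that case.
        scale = (hi - lo) / 255.0 or 1.0
        codes = [round((w - lo) / scale) for w in row]
        q_rows.append((codes, scale, lo))
    return q_rows


def embedding_bag_sum(q_rows: List[QRow], indices: List[int],
                      offsets: List[int]) -> List[List[float]]:
    """Sum-pool dequantized rows per bag; indices are used as-is."""
    dim = len(q_rows[0][0])
    bounds = offsets + [len(indices)]
    out = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        acc = [0.0] * dim
        for idx in indices[start:end]:
            codes, scale, lo = q_rows[idx]
            for j in range(dim):
                acc[j] += codes[j] * scale + lo  # dequantize on the fly
        out.append(acc)
    return out


# Illustrative data: three embedding rows, two bags ([0, 2] and [1]).
weights = [[0.0, 1.0], [2.0, 4.0], [-1.0, 1.0]]
table = quantize_rows(weights)  # no calibration inputs required
out = embedding_bag_sum(table, indices=[0, 2, 1], offsets=[0, 2])
```

Because `quantize_rows` never sees an activation, it can run once at conversion time, which is the sense in which the PR calls this static without calibration. The fp32 output is the part that resembles dynamic quantization.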
This pull request has been merged in 164b96c.
Stack from ghstack:
Summary:
embedding_bag requires only quantization of weights (no dynamic quantization of inputs),
so the type of quantization is essentially static (without calibration).
This will enable pyper to do fc and embedding_bag quantization using the same API call.
Test Plan:
python test/test_quantization.py test_embedding_bag
Differential Revision: D23467019