[quant] Create nn.quantized.dynamic.EmbeddingBag #43088
Conversation
💊 CI failures summary (as of commit 5f44c98): ✅ None of the CI failures appear to be your fault. 🚧 1 ongoing upstream failure, probably caused by upstream breakages that are not yet fixed.
vkuzo left a comment:
lg!
```python
module_out = qemb(indices, offsets)

# Call the qembedding_bag operator directly
ref = torch.ops.quantized.embedding_bag_byte(w_packed, indices, offsets, mode=0,
```
optional: do we need to check other modes as well? Not sure whether that's unnecessary / done elsewhere / etc.
We currently don't have support for 4-bit qtensors. Once we add that, we can repeat this check for that op as well.
torch/_utils.py (Outdated)
```diff
 _, scale, zero_point = quantizer_params
 tensor = torch._empty_affine_quantized(size, scale=scale, zero_point=zero_point, dtype=storage.dtype)
-elif qscheme == torch.per_channel_affine:
+elif qscheme == torch.per_channel_affine or qscheme == torch.per_channel_affine_float_qparams:
```
nit: `qscheme in (a, b)`?
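Applied to the branch above, the suggestion would read something like this sketch (not the committed code):

```python
elif qscheme in (torch.per_channel_affine, torch.per_channel_affine_float_qparams):
    ...  # per-channel deserialization path, unchanged
```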
```python
else:
    raise RuntimeError('Unsupported dtype on dynamic quantized embedding_bag!')

def forward(self, x):
```
is this just to satisfy the contract of nn.Module?
Yes
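For context, a minimal sketch of that pattern, with hypothetical names (not the PR's exact class):

```python
import torch.nn as nn

class PackedParamsHolder(nn.Module):
    """Holds packed quantized weights; never called as a layer."""
    def __init__(self, packed_weight):
        super().__init__()
        self._packed_weight = packed_weight

    def forward(self, x):
        # Identity pass-through: present only to satisfy the nn.Module interface.
        return x
```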
```python
if _weight is None:
    scales = torch.ones(num_embeddings, dtype=torch.float)
    zero_points = torch.ones(num_embeddings, dtype=torch.float)
```
optional: should this be zeros?
In this case it should be fine because we compute and store the bias as `zero_points * scale * -1`, so keeping it non-zero just makes sure we test pre-packing of some actual values.
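A minimal numeric sketch of that bookkeeping (illustrative values, not the PR's test data):

```python
import torch

num_embeddings = 3
scales = torch.ones(num_embeddings, dtype=torch.float)       # as in the default branch above
zero_points = torch.ones(num_embeddings, dtype=torch.float)

# Bias stored alongside the packed rows: zero_points * scale * -1
bias = zero_points * scales * -1.0
print(bias)  # tensor([-1., -1., -1.]) -- non-zero, so prepack exercises real values
```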
torch/tensor.py (Outdated)
```diff
 self.q_scale(),
 self.q_zero_point())
-elif self.qscheme() == torch.per_channel_affine:
+elif self.qscheme() == torch.per_channel_affine or self.qscheme() == torch.per_channel_affine_float_qparams:
```
same
This pull request has been merged in 4db8ca1.
Stack from ghstack:
Summary:
Create a quantized module that the user can use to perform embedding bag quantization.
The module uses EmbeddingPackedParams to store the weights, which can be serialized/deserialized
using TorchBind custom classes (C++ get/setstate code).
A following PR will add support for `from_float` to convert a float module to the quantized module.

Test Plan:
python test/test_quantization.py TestDynamicQuantizedModule.test_embedding_bag_api
Differential Revision: D23167519
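For readers landing here later, a minimal usage sketch of the new module; the constructor arguments are assumed to mirror `nn.EmbeddingBag` plus a quantization dtype, so check the diff for the authoritative signature:

```python
import torch
import torch.nn.quantized.dynamic as nnqd

# Hypothetical sizes; torch.quint8 selects the 8-bit (byte) path exercised by the test.
qemb = nnqd.EmbeddingBag(num_embeddings=10, embedding_dim=12,
                         mode='sum', dtype=torch.quint8)

indices = torch.randint(0, 10, (20,))
offsets = torch.tensor([0, 5, 10])       # bags [0:5), [5:10), [10:20)
out = qemb(indices, offsets)             # dynamically quantized embedding-bag lookup

# The packed weights round-trip through the TorchBind custom class (get/setstate):
torch.save(qemb.state_dict(), '/tmp/qemb.pt')
```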