[quant] Add quantized Embedding module #44208
Conversation
Summary: Add a quantized Embedding module in the static quantization namespace. Embedding quantization requires only the weights to be quantized, so it is static. Internally the module calls the embedding_bag_byte op with the offsets set to correspond to the indices. A future PR will move EmbeddingBag quantization from dynamic to static as well.

Test Plan: python test/test_quantization.py test_embedding_api
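To make the offsets trick concrete, here is a minimal float-only sketch (not the PR's code) showing that an Embedding lookup is equivalent to an EmbeddingBag call in which every bag holds exactly one index, i.e. the offsets are simply arange(len(indices)); the quantized module applies the same idea on top of the byte-quantized weight:

```python
# Float-only illustration of "offsets set corresponding to the indices":
# each bag contains exactly one index, so EmbeddingBag reduces to Embedding.
import torch
import torch.nn.functional as F

weight = torch.randn(10, 4)              # 10 embeddings of dimension 4
indices = torch.tensor([1, 3, 7])
offsets = torch.arange(indices.numel())  # one bag per index

via_embedding = F.embedding(indices, weight)
via_embedding_bag = F.embedding_bag(indices, weight, offsets, mode='sum')
assert torch.allclose(via_embedding, via_embedding_bag)
```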
💊 CI failures summary: as of commit b4a14f7, 1 ci.pytorch.org job failed (see the Dr. CI page for details).
```python
scales = torch.ones(num_embeddings, dtype=torch.float)
zero_points = torch.ones(num_embeddings, dtype=torch.float)
```
Seems like this is the same code as in packed params; perhaps we can do it only once?
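A hypothetical way to act on that suggestion (the helper name and placement are made up, not part of the PR) would be to keep the default qparams initialization in one shared function that both the module and the packed-params class call:

```python
import torch

# Hypothetical shared helper; illustrative only, mirrors the diff above.
def _default_embedding_qparams(num_embeddings):
    scales = torch.ones(num_embeddings, dtype=torch.float)
    zero_points = torch.ones(num_embeddings, dtype=torch.float)
    return scales, zero_points
```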
```python
def set_weight(self, weight):
    # type: (torch.Tensor) -> None
    if self.dtype == torch.quint8:
        self._packed_weight = torch.ops.quantized.embedding_bag_prepack(weight)
```
Do we support per tensor quantization for packed params?
Not at the moment; we only have per-row quantization support with float qparams.
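For reference, a rough numeric sketch of what per-row quantization with float qparams means (my own illustration of the byte row-wise scheme, not the kernel's exact math): each row of the weight gets its own float scale and float zero point.

```python
import torch

weight = torch.randn(10, 4)
w_min = weight.min(dim=1, keepdim=True).values
w_max = weight.max(dim=1, keepdim=True).values
scales = ((w_max - w_min) / 255.0).clamp(min=1e-8)  # one float scale per row
zero_points = w_min                                 # one float zero point per row

q_rows = torch.clamp(((weight - zero_points) / scales).round(), 0, 255).to(torch.uint8)
dq_rows = q_rows.float() * scales + zero_points     # per-row dequantization
print((weight - dq_rows).abs().max())               # small reconstruction error
```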
```python
# |--- _packed_weight : Tensor representing weight of EmbeddingPackedParamsBase
# |--- dtype : torch.dtype

def _save_to_state_dict(self, destination, prefix, keep_vars):
```
Should we also have a field for bitwidth?
We can use the tensor dtype to determine the bitwidth, right? Currently it only supports 8-bit, but once we add 4-bit qtensors the bitwidth should be encoded in the dtype.
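A tiny sketch of that idea (the 4-bit entry is hypothetical until such a dtype exists):

```python
import torch

# Derive the bitwidth from the tensor dtype instead of storing a separate field.
DTYPE_TO_BITWIDTH = {
    torch.quint8: 8,
    # a future 4-bit qtensor dtype would map to 4 here
}

def bitwidth_from_dtype(dtype):
    return DTYPE_TO_BITWIDTH[dtype]

print(bitwidth_from_dtype(torch.quint8))  # 8
```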
```python
super(Embedding, self).__init__()
self.num_embeddings = num_embeddings
self.embedding_dim = embedding_dim
self.sparse = sparse
```
For my understanding: what does self.sparse do?
Currently it doesn't do anything for the quantized module, so I'll remove it from here.
In the float module it enables sparse gradients for the weight tensor (see the example below).
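For context, a small example of the float-module behavior being referred to (standard nn.Embedding usage):

```python
import torch
import torch.nn as nn

# With sparse=True the gradient w.r.t. the weight is a sparse tensor:
# only the rows that were actually looked up carry gradient values.
emb = nn.Embedding(10, 4, sparse=True)
emb(torch.tensor([1, 3, 7])).sum().backward()
print(emb.weight.grad.is_sparse)  # True
```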
Codecov Report

```
@@            Coverage Diff              @@
##   gh/supriyar/174/base   #44208   +/- ##
===========================================
+ Coverage        69.24%   69.31%   +0.06%
===========================================
  Files              381      382       +1
  Lines            47573    47714     +141
===========================================
+ Hits             32943    33072     +129
- Misses           14630    14642      +12
```

Continue to review the full report at Codecov.
This pull request has been merged in 57b87aa.
Stack from ghstack:
Summary:
Add a quantized Embedding module in the static quantization namespace. Embedding
quantization requires only the weights to be quantized, so it is static.
Internally the module calls the embedding_bag_byte op with the offsets set to correspond to the
indices.
A future PR will move EmbeddingBag quantization from dynamic to static as well.
Test Plan:
python test/test_quantization.py test_embedding_api
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D23547384
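Finally, a hedged usage sketch of the module this PR adds (the constructor is assumed to mirror nn.Embedding's first two arguments, as the diff snippet above suggests, and the output is assumed to be a dequantized float tensor; this is an illustration, not verified API):

```python
import torch

# Assumed signature: torch.nn.quantized.Embedding(num_embeddings, embedding_dim).
# Only the weight is quantized (weight-only / "static"); indices are plain int64.
q_emb = torch.nn.quantized.Embedding(10, 4)

indices = torch.tensor([1, 3, 7])
out = q_emb(indices)   # lookup of byte-quantized rows, returned as float
print(out.shape)       # expected: torch.Size([3, 4])
```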