[quant] Add Graph Mode Passes to quantize EmbeddingBag operators #41612
Conversation
Summary:

This change adds preliminary support to quantize the EmbeddingBag operators. We currently support 4-bit and 8-bit quantization+packing of the weights.

To quantize these operators, specify the operator name in the `custom_op_name` field of the NoopObserver. Based on the op name (4bit or 8bit) we call the corresponding quantization functions. Refer to the test plan for how to invoke the qconfig for the embedding_bag ops.

Future versions of this will support 4-bit and 2-bit qtensors with native support to observe and quantize them.

NB: This version assumes that the weights in the EmbeddingBag module reside on the same device.

Test Plan:

`python test/test_quantization.py TestQuantizeDynamicJitOps.test_embedding_bag`
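The op-name-based selection described above can be sketched in plain Python. Everything below is illustrative: the function and the dispatch table are placeholders, not the PR's actual pass; `"embedding_bag_4bit"` matches the convention shown in the test plan, while the 8-bit name and the prepack op strings are assumptions.

```python
# Illustrative sketch of picking a quantization routine from a
# NoopObserver's custom_op_name. The dispatch table and target op
# strings are placeholders, not the PR's actual implementation.
def select_prepack_op(custom_op_name):
    dispatch = {
        "embedding_bag_4bit": "quantized::embedding_bag_4bit_prepack",
        "embedding_bag_byte": "quantized::embedding_bag_byte_prepack",
    }
    if custom_op_name not in dispatch:
        raise ValueError(f"unsupported embedding_bag op: {custom_op_name}")
    return dispatch[custom_op_name]

print(select_prepack_op("embedding_bag_4bit"))
```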
💊 CI failures summary and remediations, as of commit 3210bd9 (more details on the Dr. CI page):
Extra GitHub checks: 1 failed
vkuzo left a comment:
lgtm, accepting to unblock. Feel free to wait for @jerryzh168 if a deeper review on the JIT pass is needed.
```python
offsets = torch.tensor([0, 19, 20, 28, 28, 32])

from torch.quantization import QConfigDynamic, NoopObserver
int4_dynamic_qconfig = QConfigDynamic(activation=NoopObserver.with_args(custom_op_name="embedding_bag_4bit"),
```
makes sense that custom_op_name is a temporary solution. Do we have thoughts on what to replace it with eventually / why not now? Would it be EmbeddingBag{8|4|2}BitObserver / something else?
what's the longer term solution? are we planning to add torch.qint4, torch.qint2 etc.?
yes, long term this will be replaced with an observer that can support torch.qint4 and torch.qint2
```cpp
auto observer_module = module.attr(findObserverName(v).value()).toModule();
if (observer_module.hasattr("custom_op")) {
  auto op_name = observer_module.attr("custom_op").toStringRef();
  return isNoopObserver(observer) ? op_name : "";
```
follow up PR: since NoopObserver is special, probably better to add a "_" prefix to reserve this for internal use.
```cpp
}
// Insert prepack op
Node* prepack = g->create(Symbol::fromQualString(prepack_fn), prepack_inputs);
g->insertNode(prepack);
```
could you also add a WithInsertPoint? we want to insert before the use node I think.
should just be: `WithInsertPoint ins(embedding_bag_float_op);`
We've already added the insert point in insertQuantizationOps at the output of the observer node.
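The concern in this thread is ordering: the prepack node must be inserted before the node that consumes it. The toy sketch below illustrates that ordering concern with a plain Python node list; the names are placeholders and this is not the JIT's `WithInsertPoint` API.

```python
# Toy sketch of inserting a node before its consumer in a linear node
# list; names are placeholders, unrelated to the TorchScript graph API.
def insert_before(nodes, new_node, consumer):
    idx = nodes.index(consumer)  # position of the consuming node
    nodes.insert(idx, new_node)  # new node now precedes its consumer
    return nodes

graph = ["observer", "embedding_bag_float_op"]
insert_before(graph, "prepack", "embedding_bag_float_op")
print(graph)  # → ['observer', 'prepack', 'embedding_bag_float_op']
```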
```cpp
for (const Use& use : uses) {
  if (matchCallFuncToUse(use, "embedding_bag", 2)) {
    embedding_bag_float_op = use.user;
  }
}
```
what are the possible cases? is the observer_out always going to be used by an embedding_bag op here?
```cpp
inputs.push_back(g->insertGetAttr(self, qparam_name));
// Temporary solution to quantize embedding_bag operators.
auto embedding_bag_name = getEmbeddingBagObsName(module, observer);
if (quant_type == QuantType::DYNAMIC && embedding_bag_name &&
```
can you merge this branch with the one in L399 now?
I prefer keeping it separate since this is a special case and this code will be removed in the future. Wanted to make that a little more obvious :)
might also be good to have an isEmbeddingBagOp function to be consistent with isFP16NoopObserver
I see, sure. Do you mean we plan to remove the swapping of input and weight in the embedding bag module in the future? I think this will be needed in graph mode if that does not change.
```cpp
observer_out->replaceAllUsesWith(original_val);
original_val->replaceAllUsesAfterNodeWith(dequant, dequant->output());
```
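The rewiring in this hunk (redirect every use of the observer output to the original value, then point later uses at the dequantize output) can be mimicked with a toy value/use structure. This is plain Python illustrating the replace-all-uses pattern only; the classes are stand-ins, not the TorchScript IR types.

```python
# Toy illustration of the replaceAllUsesWith pattern; these classes are
# stand-ins, not the TorchScript IR types used in the hunk above.
class Value:
    def __init__(self, name):
        self.name = name
        self.uses = []  # nodes that consume this value

class Node:
    def __init__(self, inputs):
        self.inputs = list(inputs)
        for v in inputs:
            v.uses.append(self)

def replace_all_uses_with(old, new):
    # Point every consumer of `old` at `new` instead, analogous to
    # observer_out->replaceAllUsesWith(original_val).
    for node in old.uses:
        node.inputs = [new if v is old else v for v in node.inputs]
        new.uses.append(node)
    old.uses = []

observer_out = Value("observer_out")
original_val = Value("original_val")
consumer = Node([observer_out])

replace_all_uses_with(observer_out, original_val)
print([v.name for v in consumer.inputs])  # → ['original_val']
```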
these two lines are the same as 417-419
jerryzh168 left a comment:
Looks good, thanks! Had a few more inline comments.
```cpp
}

// find the observer for Value `v` and return the name of the observer
c10::optional<std::string> findObserverName(Value* v) {
```
btw we can check for types now I think
|
This pull request has been merged in 36fb14b.
Stack from ghstack:
Differential Revision: D22609342