[quant][graphmode] quant_fusion for all methods that is called #26078
Conversation
Summary: Add a QuantFusion pass to recursively fuse all the graphs that are invoked by the method. Test Plan: python test/test_jit.py 'TestJit.test_quant_fusion_module' Reviewers: pt1quant Subscribers: Tasks: Tags:
In the end the fusion will happen in one of two places: just-in-time in the graph executor, or ahead-of-time in the export flow. In neither of these scenarios would we be walking call-methods to find graphs to optimize, so I don't think this PR is the right approach.
It depends on the deployment flow; fusion is backend specific. Say we have some accelerators that can consume some form of JIT graph with no JIT runtime, then it is useful to fuse these graphs beforehand. I'm not sure what the plan is for the graph executor and these fusion passes, should they always be coupled in the long run?
…lled" Summary: Add a QuantFusion to recursively fuse all the graphs that is invoked by the method Test Plan: python test/test_jit.py 'TestJit.test_quant_fusion_module' Reviewers: pt1quant Subscribers: Tasks: Tags:
This is not needed: for fbgemm we'll integrate with the graph executor, and for other backends we'll need to call QuantFusion(Graph) in the export flow, so this as an independent function is not really used.
Fusion will always be a part of the optimization pipeline - it could be either just-in-time optimization in the graph executor, or ahead-of-time optimization for exporting to mobile/accelerators. In either case it's an always-valid transform and it will be run along with other general optimizations on all methods of all submodules. We don't need to invent the optimization pipeline ourselves here; having a function that optimizes at graph scope should suffice now and in the future.
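A minimal sketch of the graph-scoped approach discussed above, assuming fusion is exposed as a function over a single JIT graph. `fuse_graph` and `run_graph_pass_everywhere` are hypothetical names used only for illustration, not existing APIs; only the scripting calls are real.

```python
import torch

def fuse_graph(graph):
    # Hypothetical stand-in for a graph-scoped QuantFusion pass; a real
    # implementation would rewrite quantize/dequantize + op patterns here.
    pass

def run_graph_pass_everywhere(scripted_module, graph_pass):
    # Apply a graph-scoped pass to the forward graph of every submodule,
    # the way a generic JIT or export pipeline could drive it.
    for sub in scripted_module.modules():
        graph_pass(sub.graph)  # .graph is the IR of the submodule's forward

# Example: script a tiny module and run the (placeholder) pass ahead of time,
# as an export flow might do before handing graphs to a backend.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

scripted = torch.jit.script(M())
run_graph_pass_everywhere(scripted, fuse_graph)
```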
Stack from ghstack:
Summary:
Add a QuantFusion pass to recursively fuse all the graphs that are invoked by
the method (see the sketch below).
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion_module'
Reviewers:
pt1quant
Subscribers:
Tasks:
Tags:
Differential Revision: D17348985
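A hedged sketch of what the summary describes: starting from one method, fuse its graph, then follow prim::CallMethod nodes into the graphs it invokes. `fuse_one_graph` and `quant_fusion_recursive` are hypothetical placeholders (the actual pass lives in the C++ QuantFusion implementation), and this sketch only resolves direct `self.<attr>` receivers.

```python
import torch

def fuse_one_graph(graph):
    # Hypothetical placeholder for the per-graph QuantFusion rewrite.
    pass

def quant_fusion_recursive(mod, method_name="forward", visited=None):
    # Fuse the graph of mod.<method_name>, then recurse into every
    # method it calls via prim::CallMethod.
    visited = set() if visited is None else visited
    key = (id(mod), method_name)
    if key in visited:
        return
    visited.add(key)

    graph = getattr(mod, method_name).graph
    fuse_one_graph(graph)

    for node in graph.findAllNodes("prim::CallMethod"):
        callee = node.s("name")                   # called method's name
        receiver = list(node.inputs())[0].node()  # producer of the module value
        if receiver.kind() == "prim::GetAttr":
            # Call on a direct submodule, e.g. self.sub(x).
            quant_fusion_recursive(getattr(mod, receiver.s("name")), callee, visited)
        elif receiver.kind() == "prim::Param":
            # Call on self, e.g. self.helper(x).
            quant_fusion_recursive(mod, callee, visited)

# Example usage on a scripted module that calls into a submodule.
class Sub(torch.nn.Module):
    def forward(self, x):
        return x * 2

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()

    def forward(self, x):
        return self.sub(x) + 1

quant_fusion_recursive(torch.jit.script(M()))
```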