[quant][graphmode] quant_fusion for all methods that is called #26078
Conversation
Summary: Add a QuantFusion pass to recursively fuse all the graphs that are invoked by the method. Test Plan: python test/test_jit.py 'TestJit.test_quant_fusion_module' Reviewers: pt1quant Subscribers: Tasks: Tags:
In the end the fusion will happen in one of two places: just-in-time in the graph executor, or ahead-of-time in the export flow. In neither of these scenarios would we be walking call-methods to find graphs to optimize, so I don't think this PR is the right approach.
It depends on the deployment flow; fusion is backend specific. Say we have some accelerators that can consume some form of JIT graph with no JIT runtime, then it is useful to fuse these graphs beforehand. I'm not sure what the plan is for the graph executor and these fusion passes, should they always be coupled in the long run?
…lled" Summary: Add a QuantFusion to recursively fuse all the graphs that is invoked by the method Test Plan: python test/test_jit.py 'TestJit.test_quant_fusion_module' Reviewers: pt1quant Subscribers: Tasks: Tags:
This is not needed: for fbgemm we'll integrate with the graph executor, and for other backends we'll need to call QuantFusion(Graph) in the export flow, so this as an independent function is not really used.
Fusion will always be a part of the optimization pipeline - it could be either just-in-time optimization in the graph executor, or ahead-of-time optimization for exporting to mobile/accelerators. In either case it's an always-valid transform and it will be run along with other general optimizations on all methods of all submodules. We don't need to invent the optimization pipeline ourselves here; having a function that optimizes at graph scope should suffice now and in the future.
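A minimal sketch of the graph-scoped approach discussed above, assuming fusion is exposed as a function over a single JIT graph. `fuse_graph` and `run_graph_pass_everywhere` are hypothetical names used only for illustration, not existing APIs; only the scripting calls are real.

```python
import torch

def fuse_graph(graph):
    # Hypothetical stand-in for a graph-scoped QuantFusion pass; a real
    # implementation would rewrite quantize/dequantize + op patterns here.
    pass

def run_graph_pass_everywhere(scripted_module, graph_pass):
    # Apply a graph-scoped pass to the forward graph of every submodule,
    # the way a generic JIT or export pipeline could drive it.
    for sub in scripted_module.modules():
        graph_pass(sub.graph)  # .graph is the IR of the submodule's forward

# Example: script a tiny module and run the (placeholder) pass ahead of time,
# as an export flow might do before handing graphs to a backend.
class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.fc(x)

scripted = torch.jit.script(M())
run_graph_pass_everywhere(scripted, fuse_graph)
```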
Stack from ghstack:
Summary:
Add a QuantFusion pass to recursively fuse all the graphs that are invoked by
the method (see the sketch below).
Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion_module'
Reviewers:
pt1quant
Subscribers:
Tasks:
Tags:
Differential Revision: D17348985
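A hedged sketch of what the summary describes: starting from one method, fuse its graph, then follow prim::CallMethod nodes into the graphs it invokes. `fuse_one_graph` and `quant_fusion_recursive` are hypothetical placeholders (the actual pass lives in the C++ QuantFusion implementation), and this sketch only resolves direct `self.<attr>` receivers.

```python
import torch

def fuse_one_graph(graph):
    # Hypothetical placeholder for the per-graph QuantFusion rewrite.
    pass

def quant_fusion_recursive(mod, method_name="forward", visited=None):
    # Fuse the graph of mod.<method_name>, then recurse into every
    # method it calls via prim::CallMethod.
    visited = set() if visited is None else visited
    key = (id(mod), method_name)
    if key in visited:
        return
    visited.add(key)

    graph = getattr(mod, method_name).graph
    fuse_one_graph(graph)

    for node in graph.findAllNodes("prim::CallMethod"):
        callee = node.s("name")                   # called method's name
        receiver = list(node.inputs())[0].node()  # producer of the module value
        if receiver.kind() == "prim::GetAttr":
            # Call on a direct submodule, e.g. self.sub(x).
            quant_fusion_recursive(getattr(mod, receiver.s("name")), callee, visited)
        elif receiver.kind() == "prim::Param":
            # Call on self, e.g. self.helper(x).
            quant_fusion_recursive(mod, callee, visited)

# Example usage on a scripted module that calls into a submodule.
class Sub(torch.nn.Module):
    def forward(self, x):
        return x * 2

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.sub = Sub()

    def forward(self, x):
        return self.sub(x) + 1

quant_fusion_recursive(torch.jit.script(M()))
```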