[quant][fx] Remove input_output_observed from BinaryOpQuantizeHandler #74776
ghstack-source-id: 814441c
Pull Request resolved: #74776
@jerryzh168 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
…#74776)
Pull Request resolved: #74776
Imported from OSS
Reviewed By: albanD
Differential Revision: D35153531
fbshipit-source-id: fa777429eeb64a6a78a98f8d8dcd9e0903c8b209
Summary:
When both inputs of a binary op are scalars, FX tracing computes the result directly instead of generating an op in the FX graph.
Every binary op that does appear in the graph therefore has at least one tensor argument, so num_tensor_args is always at least 1
and input_output_observed always returns True for BinaryOpQuantizeHandler. The override is redundant, and this PR removes it.
We will remove the input_output_observed method itself once dynamic quantization is properly supported in qconfig.
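The reasoning in the summary can be illustrated with a toy tracer written in plain Python. This is only a sketch of the mechanism: it is not torch.fx and it is not the quantization handler code. The names `Node`, `Proxy`, `trace`, and the free function `num_tensor_args` are hypothetical and exist only for this example. It shows that a scalar-plus-scalar expression is evaluated by Python before the tracer sees it, so every recorded binary op has at least one traced (tensor-like) argument.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(eq=False)
class Node:
    """One recorded operation in the toy graph (hypothetical, not torch.fx.Node)."""
    op: str                # "placeholder" or "call_function"
    target: Any            # input name, or the Python callable that was recorded
    args: Tuple[Any, ...]  # Node objects for traced values, plain numbers for scalars


class Proxy:
    """Stand-in for a tensor during tracing: records ops instead of computing them."""

    def __init__(self, graph: List[Node], node: Node):
        self.graph = graph
        self.node = node

    def _record(self, lhs: Any, rhs: Any) -> "Proxy":
        # Unwrap proxies to their nodes; scalars are stored as-is.
        args = tuple(a.node if isinstance(a, Proxy) else a for a in (lhs, rhs))
        node = Node("call_function", operator.add, args)
        self.graph.append(node)
        return Proxy(self.graph, node)

    def __add__(self, other: Any) -> "Proxy":
        return self._record(self, other)

    def __radd__(self, other: Any) -> "Proxy":
        return self._record(other, self)


def trace(fn: Callable[[Proxy], Proxy]) -> List[Node]:
    """Run fn on a proxy input and return the list of recorded nodes."""
    graph: List[Node] = []
    placeholder = Node("placeholder", "x", ())
    graph.append(placeholder)
    fn(Proxy(graph, placeholder))
    return graph


def num_tensor_args(node: Node) -> int:
    """Count the arguments that are traced values rather than scalars."""
    return sum(isinstance(a, Node) for a in node.args)


def forward(x):
    s = 2 + 3     # scalar + scalar: plain Python arithmetic, nothing is recorded
    y = x + s     # traced + scalar: recorded with args (x, 5)
    return y + x  # traced + traced: recorded with args (y, x)


graph = trace(forward)
binary_nodes = [n for n in graph if n.op == "call_function"]
counts = [num_tensor_args(n) for n in binary_nodes]
print(counts)  # [1, 2]
```

Only two add nodes are recorded even though `forward` contains three additions, and neither of them has a tensor-argument count of 0. In this toy model, a check of the form "at least one tensor argument" can never fail for a recorded binary op, which is the situation the summary describes.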
Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D35153531