[Quant] Add fused LinearTanh module for onednn backend #88923
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88923
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit ab6b724. This comment was automatically generated by Dr. CI and updates every 15 minutes.
jerryzh168
left a comment
Looks good overall, but it will probably need some updates after the previous PRs are updated, I think.
jgong5
left a comment
Not sure if we need to rename the files from `linear_relu` to something like `linear_activation`, since we added linear+tanh into them.
Good idea. Thanks. I will do that after clarification from Jerry about PT 2.0.
**Summary**
This PR adds a fused `QLinearTanh` module for the onednn backend, which will be used for int8 inference. Calling this module with other quantization backends throws an error.

**Test plan**
python test_quantization.py TestStaticQuantizedModule
Hi @jerryzh168, I have made the changes per your comments. Could you take a look again? Thanks!
Hi @jerryzh168. Is it OK to land this? Thanks!
Hi @jerryzh168, do you have more comments on this PR? Thanks!
torch/nn/intrinsic/__init__.py
Outdated
from torch.ao.nn.intrinsic import LinearLeakyReLU
from torch.ao.nn.intrinsic import LinearTanh
Please revert these; we don't want to add new things to the torch/nn/intrinsic folder, as this is a folder that we plan to deprecate in the future.
OK, they are reverted.
jerryzh168
left a comment
Please revert the changes to the torch/nn/intrinsic folder.
OK, they are reverted.
test/allowlist_for_publicAPI.json
Outdated
"torch.nn.intrinsic.quantized.modules.bn_relu": "torch.ao.nn.intrinsic.quantized.modules.bn_relu",
"torch.nn.intrinsic.quantized.modules.conv_relu": "torch.ao.nn.intrinsic.quantized.modules.conv_relu",
"torch.nn.intrinsic.quantized.modules.linear_relu": "torch.ao.nn.intrinsic.quantized.modules.linear_relu",
"torch.nn.intrinsic.quantized.modules.linear_activation": "torch.ao.nn.intrinsic.quantized.modules.linear_activation",
Is this change still valid? We don't have torch.nn.intrinsic.quantized.modules.linear_activation now.
It's still needed, but the file name has changed:
"torch.nn.intrinsic.quantized.modules.linear_relu": "torch.ao.nn.intrinsic.quantized.modules.linear_activation",
I reverted the file name change due to CI check failures.
I reverted the change due to a CI check failure.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):

**Summary**
This PR adds a fused `QLinearTanh` module for the onednn backend, which will be used for int8 inference with the onednn backend. Calling this module with other quantization backends throws an error.

**Test plan**
python test_quantization.py TestStaticQuantizedModule

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
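As background, the benefit of fusing a quantized linear with its following activation can be sketched in a backend-agnostic way: the integer accumulator is dequantized once, tanh is applied in float, and the result is requantized once, instead of round-tripping the linear output through an intermediate int8 tensor before the activation. The sketch below is an illustration of that idea in plain Python (the helper names are hypothetical and not part of this PR or of the onednn kernels):

```python
import math

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # Affine int8 quantization: q = clamp(round(x / scale) + zero_point).
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))

def fused_linear_tanh(x_q, w_q, bias, x_scale, x_zp, w_scale, out_scale, out_zp):
    # Integer dot product between quantized input and weights
    # (this would be an int32 accumulator in a real kernel;
    # weights are assumed symmetrically quantized, i.e. weight zp = 0).
    acc = sum((xq - x_zp) * wq for xq, wq in zip(x_q, w_q))
    # Dequantize the accumulator once, apply the activation in float,
    # then requantize once. An unfused pipeline would instead quantize
    # the linear output to int8 and dequantize it again before tanh,
    # costing extra memory traffic and an extra rounding step.
    y = math.tanh(acc * x_scale * w_scale + bias)
    return quantize(y, out_scale, out_zp)
```

For example, with input `[1.0, 2.0]` quantized at scale 0.1 (giving `x_q = [10, 20]`) and weights `[0.5, -0.25]` at scale 0.05 (giving `w_q = [10, -5]`), the linear output is 0.0 and tanh(0.0) requantizes to 0.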