
Conversation

@skyw (Contributor) commented Jun 9, 2020

As discussed in #39502.

This PR adds support for exporting fake_quantize_per_tensor_affine to a pair of QuantizeLinear and DequantizeLinear.

Exporting fake_quantize_per_channel_affine to ONNX depends on onnx/onnx#2772; I will file another PR once ONNX merges that change.

It will generate an ONNX graph like this:
[screenshot: exported graph showing a QuantizeLinear/DequantizeLinear pair]
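For context, a minimal usage sketch (the module and values here are illustrative, not part of this PR's tests): a module calling fake_quantize_per_tensor_affine, exported with opset_version >= 10 so the op can be lowered to a QuantizeLinear/DequantizeLinear pair.

```python
import torch

class FakeQuantModel(torch.nn.Module):
    def forward(self, x):
        # scale=0.1, zero_point=0, quant range [0, 255] (uint8)
        return torch.fake_quantize_per_tensor_affine(x, 0.1, 0, 0, 255)

model = FakeQuantModel().eval()
dummy = torch.randn(1, 3, 8, 8)
# QuantizeLinear/DequantizeLinear exist from opset 10 onward
torch.onnx.export(model, dummy, "fake_quant.onnx", opset_version=10)
```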

@jamesr66a

@ngimel added the triaged label on Jun 10, 2020
@neginraoof (Contributor) commented Jun 11, 2020

Thanks for adding the new op! Is there an example model using the FakeQuantize op? Would it be possible to have a model test added in test_models.py?

@skyw (Contributor, author) commented Jun 12, 2020

> Thanks for adding the new op! Is there an example model using the FakeQuantize op? Would it be possible to have a model test added in test_models.py?

Yes, I can add one to test_models.py.
But I think the test will fail, because test_models.py seems to test exporting with opset 9 by default, while QuantizeLinear and DequantizeLinear were added in opset 10. @neginraoof

dr-ci bot commented Jun 12, 2020

💊 CI failures summary and remediations

As of commit 38e756b: 💚 Looks good so far! There are no failures yet. 💚

@skyw (Contributor, author) commented Jun 12, 2020

> Thanks for adding the new op! Is there an example model using the FakeQuantize op? Would it be possible to have a model test added in test_models.py?
>
> Yes, I can add one to test_models.py. But I think the test will fail, because test_models.py seems to test exporting with opset 9 by default, while QuantizeLinear and DequantizeLinear were added in opset 10. @neginraoof

Yeah, the test failed with `RuntimeError: Exporting the operator fake_quantize_per_tensor_affine to ONNX opset version 9 is not supported. Support for this operator was added in version 10, try exporting with this version.`
CI shows a segfault in later tests; I tested locally and got this error message. @neginraoof, how should I proceed with the test? Is there a way to use opset 10 in test_models.py? The test model works with torch.onnx.export().

@neginraoof (Contributor) commented

The purpose of test_models is to include a real model that uses the FakeQuantize op.
For unit tests, I think the onnxruntime tests should cover this op.

But if you have a test that you want to disable for lower opsets, you might be able to do that using the skipIfUnsupportedMinOpsetVersion decorator:

from test_pytorch_common import skipIfUnsupportedMinOpsetVersion
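For example, a test could be decorated like this (a sketch only; the model, shapes, and test name are illustrative, and run_test is the existing helper in the TestONNXRuntime class in test_pytorch_onnx_onnxruntime.py):

```python
import torch
from test_pytorch_common import skipIfUnsupportedMinOpsetVersion


class FakeQuantModel(torch.nn.Module):
    def forward(self, x):
        return torch.fake_quantize_per_tensor_affine(x, 0.1, 0, 0, 255)


# Added as a method of the existing TestONNXRuntime class, which provides run_test:
@skipIfUnsupportedMinOpsetVersion(10)  # QuantizeLinear/DequantizeLinear need opset >= 10
def test_fake_quantize_per_tensor(self):
    self.run_test(FakeQuantModel(), torch.randn(2, 4))
```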

@skyw (Contributor, author) commented Jun 15, 2020

Added skipIfUnsupportedMinOpsetVersion.

@skyw (Contributor, author) commented Jun 15, 2020

One test timed out. It doesn't seem to be related to this PR; do I need to worry about it? I clicked rerun but nothing happened.

@neginraoof (Contributor) left a review comment

Looks good. Thanks!

@skyw (Contributor, author) commented Jun 18, 2020

@raghuramank100 @houseroad any comment?

Reviewer comment:

A minor suggestion: for code readers, it may be better to raise a RuntimeError here instead of _unimplemented, since this is something not supported in ONNX itself, not a gap in the exporter.

@skyw (author) replied:

@neginraoof suggested using _unimplemented.

Reviewer comment:

I agree with @neginraoof that it is better than the ValueError you had before. But _unimplemented is probably better suited for something that can be supported in the future. Since this is something not supported in ONNX (unless there is a spec change, which can be considered), I feel that RuntimeError is a bit more appropriate.

@skyw (author) replied:

Changed to RuntimeError.
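For readers of this thread, the symbolic roughly takes this shape (an illustrative sketch, not necessarily the exact code merged in this PR; the error message and parse_args annotation here are assumptions):

```python
import torch
from torch.onnx.symbolic_helper import parse_args


@parse_args("v", "t", "i", "i", "i")
def fake_quantize_per_tensor_affine(g, inputs, scale, zero_point,
                                    quant_min=-128, quant_max=127):
    # ONNX QuantizeLinear/DequantizeLinear at opset 10 only define a uint8 zero point,
    # so other ranges are a limitation of ONNX itself, not a gap in the exporter.
    if quant_min != 0 or quant_max != 255:
        raise RuntimeError(
            "ONNX only supports uint8 fake quantization (quant_min=0, quant_max=255), "
            "got ({}, {})".format(quant_min, quant_max))
    scale = scale.float()  # keep scale as float32, not double
    zero_point = torch.tensor(zero_point, dtype=torch.uint8)  # ONNX expects a tensor zero_point
    quantized = g.op("QuantizeLinear", inputs, scale, zero_point)
    return g.op("DequantizeLinear", quantized, scale, zero_point)
```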

Reviewer comment:

Would it be possible to add a model-level test in test_models_onnxruntime.py? It would also serve as documentation of the user scenario for this functionality.

@skyw (author) replied:

It seems test_models_onnxruntime.py imports from test_pytorch_onnx_onnxruntime: https://github.com/pytorch/pytorch/blob/master/test/onnx/test_models_onnxruntime.py#L10

There is a test in test_pytorch_onnx_onnxruntime.

Reviewer comment:

That is just the run_model_test helper method. Also, the test point in test_pytorch_onnx_onnxruntime is a unit test; I was suggesting adding a larger model test in this file. Do you have a model that requires this functionality? That may be a good candidate for the model-level test.

@skyw (author) replied Jun 29, 2020:

There is a test added to test_models.py and run by test_models_onnxruntime.py.

Are you suggesting a larger one? Resnet50 feels too big for a test, maybe a LeNet?

Reviewer comment:

Yes, we need a larger model test. We actually do have resnet50 as part of test_models.

@skyw (author) replied:

Added a full QAT ResNet50.
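Roughly along these lines (an illustrative sketch only; the qconfig, input size, and file name are assumptions, not the exact merged test):

```python
import torch
from torch import quantization
from torchvision import models

# Prepare a torchvision ResNet50 for QAT so its layers carry per-tensor FakeQuantize observers
qat_resnet50 = models.resnet50(pretrained=False).eval()
qat_resnet50.qconfig = quantization.QConfig(
    activation=quantization.default_fake_quant,
    weight=quantization.default_fake_quant)  # per-tensor weights; per-channel needs opset 13
quantization.prepare_qat(qat_resnet50, inplace=True)
qat_resnet50.apply(quantization.enable_observer)
qat_resnet50.apply(quantization.enable_fake_quant)

dummy_input = torch.randn(2, 3, 224, 224)
_ = qat_resnet50(dummy_input)            # let the observers pick scales/zero points
qat_resnet50.apply(quantization.disable_observer)

# Export; FakeQuantize nodes become QuantizeLinear/DequantizeLinear pairs
torch.onnx.export(qat_resnet50, dummy_input, "qat_resnet50.onnx", opset_version=10)
```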

@gottbrath requested review from ngimel and vkuzo on July 1, 2020
@gottbrath (Contributor) commented:

@ngimel FYI
@raghuramank100 can you review this please?

@gottbrath removed the request for review from ngimel on July 1, 2020
@skyw (Contributor, author) commented Jul 14, 2020

> Looks good. Sorry for the delay on this. Could you rebase to master and retrigger the CI before landing?

Rebased. CI passed.

@facebook-github-bot (Contributor) left a comment

@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented:

@houseroad merged this pull request in 1fb2a7e.

@LMerCy commented Jul 17, 2020

> Looks good. Sorry for the delay on this. Could you rebase to master and retrigger the CI before landing?
>
> Rebased. CI passed.

Does this feature support exporting a quantized model to ONNX, or just exporting to Caffe2? @skyw

@skyw (Contributor, author) commented Jan 27, 2021

> Does this feature support exporting a quantized model to ONNX, or just exporting to Caffe2? @skyw

Sorry, I missed this... for 6 months.
This PR supports exporting a fake-quantized PyTorch model to ONNX.

facebook-github-bot pushed a commit that referenced this pull request Feb 8, 2021
Summary:
Fixes #39502

This PR adds support for exporting **fake_quantize_per_channel_affine** to a pair of QuantizeLinear and DequantizeLinear. Per tensor support was added by PR #39738.

The `axis` attribute of QuantizeLinear and DequantizeLinear, which is required for per-channel support, was added in opset 13 by onnx/onnx#2772.

[update 1/20/2021]: opset 13 is now supported on master, so the added function is properly tested. The code has also been rebased onto the new master.

The function was also tested offline with the following code:
```python
import torch
from torch import quantization
from torchvision import models

# Build a ResNet-18 and attach per-channel weight / per-tensor activation fake quantization
qat_resnet18 = models.resnet18(pretrained=True).eval().cuda()

qat_resnet18.qconfig = quantization.QConfig(
    activation=quantization.default_fake_quant,
    weight=quantization.default_per_channel_weight_fake_quant)
quantization.prepare_qat(qat_resnet18, inplace=True)
qat_resnet18.apply(quantization.enable_observer)
qat_resnet18.apply(quantization.enable_fake_quant)

# Run one forward pass so the observers collect statistics, then freeze them
dummy_input = torch.randn(16, 3, 224, 224).cuda()
_ = qat_resnet18(dummy_input)
for module in qat_resnet18.modules():
    if isinstance(module, quantization.FakeQuantize):
        module.calculate_qparams()
qat_resnet18.apply(quantization.disable_observer)

qat_resnet18.cuda()

input_names = ["actual_input_1"]
output_names = ["output1"]

# Per-channel fake quant needs the opset 13 `axis` attribute on QuantizeLinear/DequantizeLinear
torch.onnx.export(qat_resnet18, dummy_input, "quant_model.onnx", verbose=True, opset_version=13)
```
It can generate the desired graph.

Pull Request resolved: #42835

Reviewed By: houseroad

Differential Revision: D26293823

Pulled By: SplitInfinity

fbshipit-source-id: 300498a2e24b7731b12fa2fbdea4e73dde80e7ea
SplitInfinity pushed a commit that referenced this pull request Feb 18, 2021
malfet pushed a commit that referenced this pull request Feb 18, 2021
Labels: Merged, open source, triaged