@vkuzo vkuzo commented Aug 19, 2022

Stack from ghstack:

Summary:

#83658 reported that ops
followed by a certain pattern of `view` and `size` ops were not quantized
correctly by FX graph mode quantization.

Before this PR, the "size" op was in the "op shares qparams with input"
category, and the code assumed that the input of this op has the same dtype
as its output. This led to incorrectly propagating the `int` dtype
as the output of whichever op was preceding the `view` op, which in turn
made that op blocklisted from quantization.

The fix is to create a new category of ops which work on different dtypes
of tensors but are not observed. This PR does so for `size`, and also for
`shape` since it works the same way.

Test plan:

```
python test/test_quantization.py -k test_linear_view_size
```

cc @ezyang @SherlockNoMad @soumith @EikanWang @jgong5 @wenzhe-nrv
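
For reference, a minimal sketch of the reported pattern, assuming the PyTorch 1.13+ FX graph mode quantization APIs (`prepare_fx`/`convert_fx`); the module and shapes here are illustrative, not the exact repro from #83658:

```python
import torch
import torch.nn as nn
from torch.ao.quantization import get_default_qconfig_mapping
from torch.ao.quantization.quantize_fx import convert_fx, prepare_fx

class LinearViewSize(nn.Module):
    """Linear whose output feeds a view() sized via size()."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        x = self.linear(x)
        # size() returns plain ints, so it should not push the preceding
        # linear out of quantization.
        return x.view(x.size(0), -1)

model = LinearViewSize().eval()
example_inputs = (torch.randn(2, 4),)
prepared = prepare_fx(model, get_default_qconfig_mapping(), example_inputs)
prepared(*example_inputs)  # calibration
quantized = convert_fx(prepared)
# With this fix, quantized.linear is converted to a quantized Linear.
```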

@facebook-github-bot facebook-github-bot commented Aug 19, 2022


❌ 1 New Failure, 1 Base Failure

As of commit d30244c (more details on the Dr. CI page):


🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages

See GitHub Actions build pull / linux-docs / build-docs (python) (1/1)

Step: "Unknown" (full log | diagnosis details)

2022-08-20T05:20:53.3704354Z ##[error]The operation was canceled.
2022-08-19T23:27:53.7776451Z copying images... [ 97%] _static/img/tensorboard/add_images.png
2022-08-19T23:27:53.7778082Z copying images... [100%] _static/img/tensorboard/add_hparam.png
2022-08-19T23:27:53.7779545Z 
2022-08-19T23:27:53.8004458Z copying static files... done
2022-08-19T23:27:53.8004811Z copying extra files... done
2022-08-19T23:27:54.2256268Z dumping search index in English (code: en)... done
2022-08-19T23:27:54.3325756Z dumping object inventory... done
2022-08-19T23:27:54.3328394Z build succeeded.
2022-08-19T23:27:54.3329834Z 
2022-08-19T23:27:54.3330142Z The HTML pages are in build/html.
2022-08-20T05:20:53.3704354Z ##[error]The operation was canceled.
2022-08-20T05:20:53.3725451Z Prepare all required actions
2022-08-20T05:20:53.3741750Z ##[group]Run ./.github/actions/chown-workspace
2022-08-20T05:20:53.3741954Z ##[endgroup]
2022-08-20T05:20:53.3754720Z ##[group]Run docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
2022-08-20T05:20:53.3755048Z docker run --rm -v "$(pwd)":/v -w /v "${ALPINE_IMAGE}" chown -R "$(id -u):$(id -g)" .
2022-08-20T05:20:53.3766322Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2022-08-20T05:20:53.3766539Z env:
2022-08-20T05:20:53.3766762Z   ALPINE_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine
2022-08-20T05:20:53.3767004Z ##[endgroup]
2022-08-20T05:20:53.4001039Z Unable to find image '308535385114.dkr.ecr.us-east-1.amazonaws.com/tool/alpine:latest' locally

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


vkuzo added a commit that referenced this pull request Aug 19, 2022

ghstack-source-id: e94248a
Pull Request resolved: #83784
@vkuzo vkuzo requested review from andrewor14 and jerryzh168 August 19, 2022 23:10
@jerryzh168 jerryzh168 left a comment


What about the input? Do we insert observers for the input of these ops?

@vkuzo vkuzo commented Aug 23, 2022

> What about the input? Do we insert observers for the input of these ops?

I don't know why that would be needed today. What are your thoughts?

@vkuzo vkuzo requested a review from jerryzh168 August 23, 2022 20:01
@jerryzh168 jerryzh168 commented Aug 23, 2022

> What about the input? Do we insert observers for the input of these ops?
>
> I don't know why that would be needed today. What are your thoughts?

I'm not saying input observers are needed, just asking to document the behavior clearly, since the previous observation types describe the relationship between the input and output observers and assume both are inserted.

I think the question is whether we need to dequantize when the user calls `.size()` on a Tensor.

Original:
`... -> x = conv(...) -> x.reshape(x.size()) -> ...`

Quantized:
`... -> x = qconv(...) -> x.reshape(x.dequantize().size()) -> ...`

How can we have
`... -> x = qconv(...) -> x.reshape(x.size()) -> ...`?

Which version do we have right now, and how do we insert observers in the flow?

OUTPUT_SHARE_OBSERVER_WITH_INPUT = 1
# this means the output is never observed
# example: x.shape, x.size
OUTPUT_NOT_OBSERVED = 2

If both the input and the output are not observed, maybe we want to rename this to `INPUT_OUTPUT_NOT_OBSERVED`?

I see, sure, that sounds good
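
For illustration only (not the actual `torch.ao.quantization` internals), the reviewed snippet with the suggested rename applied would look roughly like this:

```python
# Output reuses the input's observer/qparams (e.g. reshape, transpose).
OUTPUT_SHARE_OBSERVER_WITH_INPUT = 1
# Neither the input nor the output is observed: the op accepts a tensor of
# any dtype and returns non-Tensor data (e.g. Tensor.size(), Tensor.shape),
# so it should not block quantization of the op that produced its input.
INPUT_OUTPUT_NOT_OBSERVED = 2

# Hypothetical mapping consulted during dtype propagation.
UNOBSERVED_METHODS = {"size", "shape"}
```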

@jerryzh168 jerryzh168 Aug 23, 2022
Thanks, can we also document whether we will dequantize the Tensor before calling `tensor.size`/`tensor.shape` here as well? See #83784 (comment)

prepared_model = prepare_fx(model_fp32, qconfig_mapping, x)
prepared_model(x)
quantized_model = convert_fx(prepared_model)
self.assertTrue(type(quantized_model.linear) == nnq.Linear)
Can we also test how many dequantize ops appear in the model as well? I think ideally we should not dequantize the Tensor to get the size.
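
A possible way to check that, assuming the converted model is a `torch.fx.GraphModule`; the helper name is made up for illustration:

```python
import torch.fx

def count_dequantize_nodes(gm: torch.fx.GraphModule) -> int:
    """Count dequantize calls in a converted FX graph."""
    count = 0
    for node in gm.graph.nodes:
        # In converted graphs, dequantize usually appears as a method call...
        if node.op == "call_method" and node.target == "dequantize":
            count += 1
        # ...and occasionally as a call to torch.dequantize.
        elif node.op == "call_function" and getattr(node.target, "__name__", "") == "dequantize":
            count += 1
    return count

# e.g. in the test: self.assertEqual(count_dequantize_nodes(quantized_model), 1)
```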

@facebook-github-bot
/easycla

As part of the transition to the PyTorch Foundation, this project now requires contributions be covered under the new CLA. See #85559 for additional details.

This comment will trigger a new check of this PR. If you are already covered, you will simply see a new "EasyCLA" check that passes. If you are not covered, a bot will leave a new comment with a link to sign.

@github-actions github-actions bot commented Dec 3, 2022

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Dec 3, 2022
@pytorch-bot pytorch-bot bot added the release notes: quantization release notes category label Dec 3, 2022
@github-actions github-actions bot closed this Jan 2, 2023
pytorchmergebot pushed a commit that referenced this pull request Jan 27, 2023 (#90001)

**Summary**
This work continues with #83784 by @vkuzo and includes all the changes in that PR.
Quote from #83784:
> Issue #83658 reports that ops followed by a certain pattern of `view` and `size` ops were not quantized correctly by FX graph mode quantization.
> Before this PR, the "size" op was in the "op shares qparams with input" category, and the code assumed that the input of this op has the same dtype as its output. This led to incorrectly propagating the `int` dtype as the output of whichever op was preceding the `view` op, which in turn made that op blocklisted from quantization.

> The fix is to create a new category of ops which work on different dtypes of tensors but are not observed. This PR does so for `size`, and also for `shape` since it works the same way.

**Note**: This PR needs #91297 to be landed first, otherwise there is a UT failure.

**Test plan**
```
python test/test_quantization.py -k test_linear_size_view
python test/test_quantization.py -k test_linear_shape_view
```

Pull Request resolved: #90001
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
@facebook-github-bot facebook-github-bot deleted the gh/vkuzo/513/head branch June 8, 2023 18:57