
Conversation

@leslie-fang-intel
Collaborator

@leslie-fang-intel leslie-fang-intel commented Dec 7, 2022

Stack from ghstack (oldest at bottom):

Summary
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused conv2d_add_relu op for the onednn backend, which will be used for int8 inference. Calling this op with any other quantization backend throws an error.

Test Plan

```
python -m pytest test_quantization.py::TestQuantizedConv
```

cc @VitalyFedyunin @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
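For readers unfamiliar with the pattern being fused: conv2d_add_relu computes relu(conv2d(x, w) + y) in a single kernel instead of three separate passes. The sketch below is a plain NumPy reference for that arithmetic, for illustration only; it is not the actual int8 oneDNN kernel, and the function name is hypothetical.

```python
import numpy as np

def conv2d_add_relu_reference(x, w, y):
    """Hypothetical NumPy reference for the fused pattern relu(conv2d(x) + y).

    x: (N, C_in, H, W), w: (C_out, C_in, kH, kW), y: tensor added to the
    conv output (same shape as the conv result). No padding, stride 1.
    The real fused op avoids materializing the intermediate conv output.
    """
    n, c_in, h, width = x.shape
    c_out, _, kh, kw = w.shape
    out_h, out_w = h - kh + 1, width - kw + 1
    out = np.zeros((n, c_out, out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, :, i:i + kh, j:j + kw]  # (N, C_in, kH, kW)
            # Contract channel and kernel dims against the weights -> (N, C_out)
            out[:, :, i, j] = np.tensordot(patch, w, axes=([1, 2, 3], [1, 2, 3]))
    # Fusing the add and relu into the conv epilogue is what saves the
    # extra reads/writes of the intermediate tensor.
    return np.maximum(out + y, 0.0)
```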

@pytorch-bot

pytorch-bot bot commented Dec 7, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90364

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e4877f1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: quantization release notes category label Dec 7, 2022
@github-actions github-actions bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Dec 7, 2022
leslie-fang-intel added a commit that referenced this pull request Dec 7, 2022
@leslie-fang-intel leslie-fang-intel marked this pull request as draft December 7, 2022 07:36
@leslie-fang-intel leslie-fang-intel added intel This tag is for PR from Intel ciflow/trunk Trigger trunk jobs on your pull request labels Dec 7, 2022
**Summary**
Post-op fusion can reduce data movement overhead and improve inference performance. This PR adds a fused conv2d_add_relu op for the onednn backend, which will be used for int8 inference. Calling this op with any other quantization backend throws an error.

**Test Plan**
```
python -m pytest test_quantization.py::TestQuantizedConv
```

**TODO**
There is a oneDNN issue that may cause a kernel core dump with some input shapes. This PR should be merged after that issue is resolved.

cc VitalyFedyunin jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10

[ghstack-poisoned]
leslie-fang-intel added a commit that referenced this pull request Dec 10, 2022
@leslie-fang-intel leslie-fang-intel requested review from jerryzh168 and removed request for z-a-f January 10, 2023 02:38
@leslie-fang-intel
Collaborator Author

@jerryzh168 Thanks for the comments; I have addressed them. Could you take another look at this PR?

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jan 13, 2023
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jan 26, 2023
@leslie-fang-intel
Collaborator Author

Hi @jerryzh168, are there any other comments on this PR? Could you take another look?

```
Y_scale, Y_zero_point, use_bias, "add", use_channelwise, False,
input_dtype=X_qdtype, output_dtype=X_qdtype, X2_scale=X2_scale, X2_zero_point=X2_zero_point)

@given(batch_size=st.integers(1, 3),
```
Contributor


Please don't add new calls to hypothesis; it has caused a lot of flaky test errors in CI before. Can you change them to loops instead?

Collaborator Author


Thanks for the comments. I have changed it to use for loops via itertools.product. Please take another look, @jerryzh168.
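The conversion the reviewer asked for can be sketched as follows: replace hypothesis's randomly sampled `@given` parameters with a deterministic grid enumerated by `itertools.product`. Parameter names mirror the `@given` decorator quoted above; the value ranges and the helper name are illustrative, not the actual test code.

```python
import itertools

def iter_conv_test_cases():
    # Deterministic replacement for hypothesis sampling: enumerate the
    # full parameter grid instead of drawing random examples in CI.
    batch_sizes = range(1, 4)      # was st.integers(1, 3)
    use_biases = (True, False)     # was st.booleans()
    yield from itertools.product(batch_sizes, use_biases)

# Inside the test, the @given decorator becomes a plain loop
# (the called helper here is hypothetical):
#     for batch_size, use_bias in iter_conv_test_cases():
#         self._test_qconv_impl(batch_size, use_bias, ...)
```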

@leslie-fang-intel
Collaborator Author

Hi @jerryzh168, thanks for your review and comments on this ghstack. The previous comments have all been addressed. Could you kindly take another look? There are still some PRs inside this ghstack that may need your approval.

@leslie-fang-intel
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@vkuzo
Contributor

vkuzo commented Feb 7, 2023

I have a build error blamed on this PR when trying to build PyTorch on my machine: https://gist.github.com/vkuzo/43fa00f625fa099eb50b1e41bf8a9b2e, specifically in the residual_with_sum_zero_point function. Any thoughts on how I can debug this?

@jerryzh168
Contributor

Is this related to the version of the ideep dependency?

@leslie-fang-intel
Collaborator Author

@vkuzo Can you try updating your ideep version to match PyTorch master? residual_with_sum_zero_point was newly added to ideep, and that ideep version has been integrated into PyTorch.
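If a stale or corrupted third_party/ideep checkout is the cause, re-syncing the submodule against master is one way to pick up the pinned ideep commit. This is a sketch assuming a standard PyTorch source checkout; adjust branch and remote names to your setup.

```shell
# Run from the PyTorch source root.
git checkout master
git pull origin master
# Re-sync the ideep submodule to the commit pinned by master.
git submodule sync third_party/ideep
git submodule update --init --recursive third_party/ideep
```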

@vkuzo
Contributor

vkuzo commented Feb 8, 2023

Thanks, my ideep third_party repo was corrupted; removing and recloning it fixed the error. Sorry for the false report!

@facebook-github-bot facebook-github-bot deleted the gh/leslie-fang-intel/5/head branch June 8, 2023 17:53

Labels

ciflow/trunk Trigger trunk jobs on your pull request intel This tag is for PR from Intel Merged module: cpu CPU specific problem (e.g., perf, algorithm) open source release notes: quantization release notes category

Projects

Status: Done


7 participants