Conversation

@leslie-fang-intel
Collaborator

@leslie-fang-intel leslie-fang-intel commented Dec 14, 2022

Stack from ghstack (oldest at bottom):

Summary
Previously, we used DNNL_RUNTIME_S32_VAL as the zero point for src in both weight prepack and convolution forward, to ensure the same weight block format was used in both places. The problem is that DNNL_RUNTIME_S32_VAL may query a different weight block format than the true src zero point does, which sends oneDNN convolution down the jit path instead of the brgconv path. This PR uses the true src zero point to create the pd, and reorders the weight if the queried block format differs from the one weight prepack generated.

Test Plan

```
python -m pytest quantization/core/test_quantized_op.py::TestQuantizedConv::test_conv_transpose_reorder_issue_onednn
```
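The fix described in the summary can be illustrated with a small, self-contained sketch. All function names and the format strings below are hypothetical stand-ins, not the real oneDNN/ideep API: prepack queries a block format with whatever zero point it assumes, while forward queries with the true zero point and reorders only on a mismatch.

```python
# Hypothetical sketch of the fix; the format-query logic is a stand-in,
# not real oneDNN/ideep code.
def query_blocked_format(src_zero_point):
    # Stand-in for creating a conv pd and querying its expected weight
    # format: pretend a zero src zero point selects brgconv-friendly blocking.
    return "ABcd8a8b" if src_zero_point == 0 else "ABcd4a4b"

def prepack_weight(weight, assumed_src_zero_point):
    # Prepack queries the format once, with the zero point it assumes.
    return {"data": weight, "format": query_blocked_format(assumed_src_zero_point)}

def conv_forward(packed_weight, src, src_zero_point):
    # The pd is now created with the *true* src zero point ...
    expected = query_blocked_format(src_zero_point)
    if packed_weight["format"] != expected:
        # ... and the weight is reordered only when the block formats differ.
        packed_weight = {"data": packed_weight["data"], "format": expected}
    return packed_weight

w = prepack_weight([1.0], assumed_src_zero_point=0)
out = conv_forward(w, src=[2.0], src_zero_point=5)
```

When the true zero point matches the one assumed at prepack time, no reorder happens, so the common case pays no extra cost.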

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

@pytorch-bot

pytorch-bot bot commented Dec 14, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90818

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 0656a6e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Dec 14, 2022
@leslie-fang-intel leslie-fang-intel marked this pull request as draft December 14, 2022 03:58
@leslie-fang-intel leslie-fang-intel added intel This tag is for PR from Intel ciflow/trunk Trigger trunk jobs on your pull request labels Dec 14, 2022
@leslie-fang-intel leslie-fang-intel changed the title [Quant] Use the true src zero point to query and create pd [Quant] Use the true src zero point to query and create conv pd Dec 14, 2022
@leslie-fang-intel
Collaborator Author

@Xia-Weiwen Please help to take a look.

leslie-fang-intel added a commit that referenced this pull request Dec 14, 2022
leslie-fang-intel added a commit that referenced this pull request Dec 14, 2022
```diff
   }
-  // Since src zero point is unknown, set runtime value here
-  op_attr.set_zero_points(DNNL_ARG_SRC, ideep::utils::tensor_zp_mask(1), {DNNL_RUNTIME_S32_VAL});
+  op_attr.set_zero_points(DNNL_ARG_SRC, ideep::utils::tensor_zp_mask(1), src_zero_points);
```
Collaborator

Are we using the src zero point to initialize the primitive in the cache, or DNNL_RUNTIME_S32_VAL? If it is the latter, won't the primitive cache stop taking effect?

Collaborator

> Are we using the src zero point to initialize the primitive in the cache, or DNNL_RUNTIME_S32_VAL? If it is the latter, won't the primitive cache stop taking effect?

We used to use the runtime value to create the primitive and put it in the cache. This PR uses the true src zero point to create the primitive.
The src zero point is part of the cache key, so if it changes, the cache misses. This has always been the behavior; it is not introduced by this PR.
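The cache-key behavior discussed here can be modeled with a toy sketch (this is not the real ideep cache; `get_conv_primitive` is a hypothetical stand-in): with the true src zero point in the key, a repeated zero point hits the cache and a new one simply misses, creating a new primitive.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_conv_primitive(src_shape, src_zero_point):
    # Stand-in for building a conv primitive from its descriptor; each
    # cache miss constructs a fresh object.
    return object()

p1 = get_conv_primitive((1, 3, 224, 224), 0)
p2 = get_conv_primitive((1, 3, 224, 224), 0)  # same key: cache hit
p3 = get_conv_primitive((1, 3, 224, 224), 5)  # new zero point: cache miss
```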

Collaborator

OK. If we were using DNNL_RUNTIME_S32_VAL previously, the src zero point shouldn't be part of the key? Anyway, the change looks good then.

@Xia-Weiwen (Collaborator) left a comment

LGTM
BTW, `try_reorder` will no longer be needed once PR #90354, which uses the new ideep API, is landed; the `try_reorder` behavior is implemented inside ideep.

```
qx = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)
# The following should pass when the input shape is changed
torch.ops.quantized.conv2d(qx, w_packed, output_scale=1.0, output_zero_point=0)
# conv_transposed part
```
Contributor

maybe split this into a separate test?

Collaborator Author

Thanks for the comment. I've put it into a separate test.

@leslie-fang-intel leslie-fang-intel requested review from jerryzh168 and removed request for z-a-f January 10, 2023 02:52
@leslie-fang-intel
Collaborator Author

@jerryzh168 I have made changes according to your comments. Could you take a look at this PR again?

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jan 13, 2023
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jan 26, 2023
@leslie-fang-intel
Collaborator Author

Hi @jerryzh168, are there any other comments on this PR? Could you take another look?

Comment on lines 6237 to 6238
```
if 'onednn' not in supported_qengines:
    return
```
Contributor

Collaborator Author

Good suggestions. Changed it.
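For context, the early `return` above makes the test silently pass when the onednn engine is unavailable. An idiomatic alternative, shown here as a self-contained sketch with a stand-in `supported_qengines` list and a hypothetical class name (not the actual PyTorch test code), is `unittest.skipIf`, which reports the test as skipped instead:

```python
import unittest

# Stand-in for the real engine list; assume 'onednn' is unavailable here
# so the skip path is exercised.
supported_qengines = ['fbgemm']

class ConvReorderTest(unittest.TestCase):
    @unittest.skipIf('onednn' not in supported_qengines,
                     "requires the onednn qengine")
    def test_conv_reorder_issue_onednn(self):
        pass  # the real test body would go here

suite = unittest.TestLoader().loadTestsFromTestCase(ConvReorderTest)
res = unittest.TextTestRunner(verbosity=0).run(suite)
```

Skipped tests show up in the test report, so a missing backend is visible rather than masked as a pass.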

@leslie-fang-intel
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@facebook-github-bot facebook-github-bot deleted the gh/leslie-fang-intel/12/head branch June 8, 2023 17:51

Labels

- ciflow/trunk (Trigger trunk jobs on your pull request)
- intel (This tag is for PR from Intel)
- Merged
- module: cpu (CPU specific problem, e.g., perf, algorithm)
- open source
- release notes: quantization (release notes category)

Projects

Status: Done

7 participants