throw RuntimeErrors for mkldnn modules when mkldnn is disabled #40423
Conversation
torch/utils/mkldnn.py
Outdated
this is bringing us back to mkldnn not being called with dilation. Should this PR be landed on top of the one that allows convolutions with dilation to use mkldnn?
It is enabled in #40483.
What is the convolution behavior w.r.t. mkldnn or dense inputs? Similar to the cases you posted above for linear and bn.
After #40610, torch.convolution won't be calling mkldnn in a fair number of cases, so you still need to call mkldnn_convolution here.
@ngimel, I don't think we need to call mkldnn_convolution here. If input and weight are MKLDNN tensors, it will call mkldnn_convolution directly, see pytorch/aten/src/ATen/native/Convolution.cpp, lines 230 to 239 in fcadca1:

```cpp
auto ConvParams::use_mkldnn(const at::Tensor& input) const -> bool {
#if AT_MKLDNN_ENABLED()
  if (!at::globalContext().userEnabledMkldnn()) {
    return false;
  }
  return (input.is_mkldnn()) || // input is mkldnn Tensor
    (input.options().backend() == at::Backend::CPU &&
     input.scalar_type() == kFloat && // only on CPU Float Tensors
     !transposed && // or transposed tensors
     input.ndimension() == 4); // must be in NCHW format
```
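For reference, the dispatch predicate in that excerpt can be restated as a torch-free Python sketch. `FakeTensor`, `would_use_mkldnn`, and their fields are hypothetical stand-ins for what `use_mkldnn` inspects, not real torch APIs:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the tensor properties use_mkldnn() inspects.
@dataclass
class FakeTensor:
    is_mkldnn: bool = False
    backend_is_cpu: bool = True
    is_float: bool = True
    ndim: int = 4

def would_use_mkldnn(t, transposed=False, user_enabled_mkldnn=True,
                     mkldnn_compiled=True):
    """Python restatement of ConvParams::use_mkldnn (Convolution.cpp, 230-239)."""
    if not mkldnn_compiled:        # #if AT_MKLDNN_ENABLED()
        return False
    if not user_enabled_mkldnn:    # at::globalContext().userEnabledMkldnn()
        return False
    # Either the input is already an mkldnn tensor, or it is a
    # non-transposed NCHW CPU float tensor.
    return t.is_mkldnn or (
        t.backend_is_cpu and t.is_float and not transposed and t.ndim == 4)

# An MKLDNN input takes the mkldnn path as long as the flag is on.
print(would_use_mkldnn(FakeTensor(is_mkldnn=True)))                    # True
# Disabling mkldnn via the context flag turns it off even for MKLDNN inputs.
print(would_use_mkldnn(FakeTensor(is_mkldnn=True),
                       user_enabled_mkldnn=False))                     # False
```

This is why an explicit mkldnn_convolution call is unnecessary when both input and weight are already MKLDNN: the first branch of the return fires.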
aten/src/ATen/native/Convolution.cpp
Outdated
Is it enough to check only for mkldnn inputs? Currently the following (which, as far as I understand, sends a regular tensor to mkldnn convolution) works:

```python
x = torch.randn(N, C, 56, 56, dtype=torch.float)  # .to_mkldnn()
conv2d = torch.nn.Conv2d(in_channels=C,
                         out_channels=M,
                         kernel_size=3,
                         padding=1)
mkldnn_conv2d = mkldnn_utils.to_mkldnn(copy.deepcopy(conv2d))
with torch.backends.mkldnn.flags(enabled=True):
    y_mkldnn = mkldnn_conv2d(x)  # .to_dense()
```

Same question for all the other functions: you added checks only for mkldnn inputs.
@ngimel, for convolution, this input check can be removed:

- torch.backends.mkldnn.flags(enabled=True), input and weight are MKLDNN tensors: this works.
- torch.backends.mkldnn.flags(enabled=True), input is a dense tensor and weight is an MKLDNN tensor: the MKLDNN path is taken, and there is a check that input and weight have the same type, see pytorch/aten/src/ATen/native/Convolution.cpp, lines 746 to 748 in 6468bc4:

```cpp
TORCH_CHECK(input.options().type_equal(weight.options()),
    "Input type (", input.toString(), ") and weight type (", weight.toString(),
    ") should be the same");
```

  so this reports an error: Input type (torch.FloatTensor) and weight type (Mkldnntorch.FloatTensor) should be the same.
- torch.backends.mkldnn.flags(enabled=False), input and weight are MKLDNN tensors: this takes the path at pytorch/aten/src/ATen/native/Convolution.cpp, lines 801 to 802 in 6468bc4:

```cpp
output = at::_convolution_nogroup(
    input.contiguous(), weight, bias, params.stride, params.padding,
    params.dilation, params.transposed, params.output_padding);
```

  which does not support MKLDNN inputs, so this reports an error: "opaque tensors do not have is_contiguous".
- torch.backends.mkldnn.flags(enabled=False), input is a dense tensor and weight is an MKLDNN tensor: the same _convolution_nogroup path is taken, and this reports an error: Cannot access data pointer of Tensor that doesn't have storage.
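To keep the four convolution cases straight, here is a small torch-free sketch that simply encodes the outcomes listed above. It is illustrative only, not the real dispatch logic; the weight is assumed to have been converted by mkldnn_utils.to_mkldnn in every case:

```python
def conv_outcome(mkldnn_enabled, input_is_mkldnn):
    """Encode the four convolution cases described in the comment above."""
    if mkldnn_enabled and input_is_mkldnn:
        return "ok"
    if mkldnn_enabled:
        # Dense input, MKLDNN weight: TORCH_CHECK in Convolution.cpp, 746-748.
        return "Input type and weight type should be the same"
    if input_is_mkldnn:
        # mkldnn disabled: _convolution_nogroup calls input.contiguous()
        # on an opaque tensor.
        return "opaque tensors do not have is_contiguous"
    # mkldnn disabled, dense input, MKLDNN weight.
    return "Cannot access data pointer of Tensor that doesn't have storage"

# Enumerate all four combinations.
for enabled in (True, False):
    for is_mkldnn in (True, False):
        print(enabled, is_mkldnn, "->", conv_outcome(enabled, is_mkldnn))
```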
aten/src/ATen/native/Linear.cpp
Outdated
For linear:

```python
x = torch.randn(56, 56, dtype=torch.float)
linear = torch.nn.Linear(56, 56, bias=True)
mkldnn_linear = mkldnn_utils.to_mkldnn(copy.deepcopy(linear))
with torch.backends.mkldnn.flags(enabled=True):
    y_mkldnn = mkldnn_linear(x)
```

- torch.backends.mkldnn.flags(enabled=True), input and weight are MKLDNN tensors: it works.
- torch.backends.mkldnn.flags(enabled=True), input is a dense tensor and weight is an MKLDNN tensor: it works, because we first convert the input to an MKLDNN tensor (line 30 in bdc0019): x_mkldnn = x if x.is_mkldnn else x.to_mkldnn()
- torch.backends.mkldnn.flags(enabled=False), input and weight are MKLDNN tensors: it reports an error: MKLDNN linear module can't run when mkldnn disabled
- torch.backends.mkldnn.flags(enabled=False), input is a dense tensor and weight is an MKLDNN tensor: it also reports the error: MKLDNN linear module can't run when mkldnn disabled.
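The shape of the linear cases can be sketched without torch: the wrapper normalizes a dense input to MKLDNN before doing anything else, which is why case 2 works, and the disabled-flag check fires first, which is why cases 3 and 4 fail with a clear RuntimeError. `FakeTensor` and `mkldnn_linear_forward` below are hypothetical illustrations, not the real module:

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor that knows its layout."""
    def __init__(self, is_mkldnn=False):
        self.is_mkldnn = is_mkldnn
    def to_mkldnn(self):
        return FakeTensor(is_mkldnn=True)

def mkldnn_linear_forward(x, mkldnn_enabled=True):
    if not mkldnn_enabled:
        # The clear error this PR adds, instead of a deeper failure.
        raise RuntimeError("MKLDNN linear module can't run when mkldnn disabled")
    # torch/utils/mkldnn.py, line 30: dense inputs are converted first,
    # which is why case 2 (dense input, MKLDNN weight) works.
    x_mkldnn = x if x.is_mkldnn else x.to_mkldnn()
    return x_mkldnn  # the real matmul is elided in this sketch

print(mkldnn_linear_forward(FakeTensor()).is_mkldnn)   # True
try:
    mkldnn_linear_forward(FakeTensor(), mkldnn_enabled=False)
except RuntimeError as e:
    print(e)   # MKLDNN linear module can't run when mkldnn disabled
```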
For BatchNorm:

```python
x = torch.randn(1, 3, 56, 56, dtype=torch.float)
bn = torch.nn.BatchNorm2d(3).float().train(False)
mkldnn_bn = mkldnn_utils.to_mkldnn(copy.deepcopy(bn))
with torch.backends.mkldnn.flags(enabled=True):
    y_mkldnn = mkldnn_bn(x)
```

- torch.backends.mkldnn.flags(enabled=True), input and weight are MKLDNN tensors: it works.
- torch.backends.mkldnn.flags(enabled=True), input is a dense tensor and weight is an MKLDNN tensor: it reports an error: mkldnn_tensor.is_mkldnn() INTERNAL ASSERT FAILED at "../aten/src/ATen/native/mkldnn/MKLDNNCommon.cpp":56, please report a bug to PyTorch. mkldnn_to_dense expects MKL-DNN tensor input.
- torch.backends.mkldnn.flags(enabled=False), input and weight are MKLDNN tensors: it reports an error: MKLDNN batch_norm module can't run when mkldnn is disabled.
- torch.backends.mkldnn.flags(enabled=False), input is a dense tensor and weight is an MKLDNN tensor: it reports the same INTERNAL ASSERT FAILED error as case 2: mkldnn_to_dense expects MKL-DNN tensor input.
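A hedged, torch-free sketch of why batch norm behaves differently from linear in cases 2 and 4: unlike the linear wrapper, the batch-norm path does not normalize a dense input to MKLDNN first, so a dense tensor reaches a helper that asserts its argument really is an MKLDNN tensor. All names below are hypothetical stand-ins, not real torch internals:

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor that knows its layout."""
    def __init__(self, is_mkldnn=False):
        self.is_mkldnn = is_mkldnn

def mkldnn_only_helper(t):
    # Stand-in for the MKLDNN-only check at MKLDNNCommon.cpp:56.
    assert t.is_mkldnn, "mkldnn_to_dense expects MKL-DNN tensor input"
    return t

def mkldnn_batch_norm_forward(x):
    # Unlike linear, there is no `x if x.is_mkldnn else x.to_mkldnn()`
    # normalization here, so a dense input reaches the MKLDNN-only
    # helper and trips the internal assert.
    return mkldnn_only_helper(x)

print(mkldnn_batch_norm_forward(FakeTensor(True)).is_mkldnn)   # True
try:
    mkldnn_batch_norm_forward(FakeTensor(False))
except AssertionError as e:
    print(e)   # mkldnn_to_dense expects MKL-DNN tensor input
```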
There should not be INTERNAL_ASSERTs here, right? Is it a real pytorch bug that you have to fix, or should the error message be different? A couple of additional questions:
- Why is mkldnn_to_dense attempted here? Is it when converting the output of the operation to a dense tensor?
- Why is the behavior here different from linear, where case 2 works and case 4 errors out as expected, with the expected error message?
For the first one, it is called in:

```cpp
ideep::tensor& itensor_from_mkldnn(const MKLDNNTensor& mkldnn_tensor) {
```

For the second: case 2 works for linear because we always convert a dense input tensor to an MKLDNN tensor first, see line 30 in 7b0f867:

```python
x_mkldnn = x if x.is_mkldnn else x.to_mkldnn()
```
Hi @XiaobingSuper! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but we do not have a signature on file. In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!
Fixes #40391: we now throw a RuntimeError for MKLDNN modules when mkldnn is disabled.