8 changes: 8 additions & 0 deletions aten/src/ATen/native/TensorShape.cpp
@@ -624,6 +624,10 @@ Tensor & transpose_(Tensor & self, int64_t dim0, int64_t dim1) {
     return sparse_transpose_(self, dim0, dim1);
   }
 
+  if (self.is_mkldnn()) {
+    return at::mkldnn_transpose_(self, dim0, dim1);
+  }
+
   auto strides = self.strides().vec();
   auto sizes = self.sizes().vec();
   std::swap(strides[dim0], strides[dim1]);
@@ -644,6 +648,10 @@ Tensor transpose(const Tensor & self, int64_t dim0, int64_t dim1) {
     return sparse_transpose_(self_clone, dim0, dim1);
   }
 
+  if (self.is_mkldnn()) {
+    return at::mkldnn_transpose(self, dim0, dim1);
+  }
+
   auto strides = self.strides().vec();
   auto sizes = self.sizes().vec();
   std::swap(strides[dim0], strides[dim1]);
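The effect of this dispatch change, sketched as a small usage example (not part of the diff; it assumes a PyTorch build with MKL-DNN support and mirrors test_transpose further down): transpose on an opaque MKL-DNN tensor now routes through at::mkldnn_transpose and matches the dense result.

```python
import torch

x = torch.randn(3, 4, 5, dtype=torch.float32)
# transpose() on an MKL-DNN tensor takes the is_mkldnn() branch above and
# dispatches to at::mkldnn_transpose; the result matches the dense transpose.
y = x.to_mkldnn().transpose(0, 2).to_dense()
assert torch.allclose(y, x.transpose(0, 2))
```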
30 changes: 27 additions & 3 deletions aten/src/ATen/native/mkldnn/TensorShape.cpp
@@ -20,6 +20,14 @@ Tensor mkldnn_clone(const Tensor& self) {
   AT_ERROR("mkldnn_clone: ATen not compiled with MKLDNN support");
 }
 
+Tensor mkldnn_transpose(const Tensor& self, int64_t dim0, int64_t dim1) {
+  AT_ERROR("mkldnn_transpose: ATen not compiled with MKLDNN support");
+}
+
+Tensor& mkldnn_transpose_(Tensor& self, int64_t dim0, int64_t dim1) {
+  AT_ERROR("mkldnn_transpose_: ATen not compiled with MKLDNN support");
+}
+
 } // namespace native
 } // namespace at

@@ -37,10 +45,12 @@ Tensor mkldnn_view(const Tensor& self, IntArrayRef size) {
 
 Tensor mkldnn_reshape(const Tensor& self, IntArrayRef size) {
   auto inferred_size = at::infer_size(size, self.numel());
+  if (self.sizes() == inferred_size) {
+    return self;
+  }
   const ideep::tensor& x = itensor_from_mkldnn(self);
-  ideep::tensor y;
-  ideep::direct_copy::compute<AllocForMKLDNN>(x, y);
-  y.reshape({inferred_size.cbegin(), inferred_size.cend()});
+  ideep::tensor y{x};
+  y.reshape<AllocForMKLDNN>({inferred_size.cbegin(), inferred_size.cend()});
Contributor:
Hmm, if the new shape is the same as the old shape, ideep::tensor::reshape will not create a copy, so y will share the same memory as x. This could be messy, since there would then be two ATen opaque tensors sharing the same underlying ideep tensor.

Collaborator:
@bddppq We wanted to keep the semantics of reshape as much as possible here, i.e. do a shallow copy on a plain-format MKL-DNN tensor (whose layout is contiguous) and a deep copy otherwise (see https://github.com/intel/ideep/blob/d7304e0345c3f0647cb020a71682d680b9f0424a/include/ideep/tensor.hpp#L1100). For the shallow-copy case, ideep::tensor y{x} makes y copy the metadata of x while sharing its underlying storage. This is the same semantics as a shallow copy of a native CPU tensor, so it should work fine here.

Contributor (bddppq, Jun 19, 2019):
@jgong5 It's fine to follow the semantics of a CPU tensor and return the input tensor as output when the shapes are the same, but it's not fine to let two different ATen opaque tensors hold the same underlying ideep tensor.

Collaborator:
@bddppq But y and x are two different ideep::tensor objects, right? They just share the same underlying storage while holding their own metadata. It is similar to how two CPU tensors can share the same underlying storage while having their own metadata. Does that make sense?

Collaborator:
@bddppq @XiaobingSuper OK, sounds like we should call y.set_descriptor(x.get_descriptor()) explicitly after ideep::tensor y{x} to guarantee that y uses a new MKL-DNN descriptor while sharing the underlying buffer of x. We really should have wrapped an explicit shallow-copy API in ideep for this.

Contributor (bddppq, Jun 20, 2019):
@jgong5 @XiaobingSuper I would say just add the following lines at the beginning of this function :-)

if (self.sizes() == inferred_size) {
  return self;
}

Collaborator (author):
Yes, I will change it. Thanks!

Collaborator (author):
@bddppq, @jgong5, changed it, thanks!

Collaborator:
@bddppq y.reshape creates a new descriptor by calling set_descriptor(), so an additional y.set_descriptor(x.get_descriptor()) is not necessary here. With the recent change from @XiaobingSuper, the behavior of mkldnn_reshape is as follows, mimicking the semantics of reshape on a native CPU tensor:

  1. If the requested shape is the same as self's, return self directly.
  2. If the format of the MKL-DNN tensor is public (plain), i.e. the storage is contiguous, the returned MKL-DNN tensor shares the storage of self and copies all of self's metadata, with the new shape.
  3. In all other cases, return a deep copy of self with the new shape.

Contributor:
@jgong5 @XiaobingSuper This sounds good to me.
   return new_with_itensor_mkldnn(std::move(y), self.options());
 }
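A minimal Python sketch of the reshape semantics described in the review thread above (not part of the diff; it assumes an MKL-DNN build and a plain-format tensor, i.e. one converted from a contiguous CPU tensor):

```python
import torch

x = torch.randn(3, 4, dtype=torch.float32)
y = x.to_mkldnn()

# Case 1: the requested shape equals the current shape, so self is returned directly.
same_shape = y.reshape((3, 4))

# Case 2: for a plain-format tensor the reshaped result shares storage with y,
# so an in-place add through one reshape is visible through another.
z = y.reshape((12,)).add_(y.reshape((12,)))
assert torch.allclose(y.reshape((12,)).to_dense(), z.to_dense())
```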

@@ -51,6 +61,20 @@ Tensor mkldnn_clone(const Tensor& self) {
   return new_with_itensor_mkldnn(std::move(dst), self.options());
 }
 
+Tensor mkldnn_transpose(const Tensor & self, int64_t dim0, int64_t dim1) {
+  const ideep::tensor& x = itensor_from_mkldnn(self);
+  ideep::tensor y;
+  std::vector<int> axes(x.ndims());
+  std::iota(axes.begin(), axes.end(), 0);
+  std::swap(axes[dim0], axes[dim1]);
+  y.transpose_from<AllocForMKLDNN>(x, axes);
+  return new_with_itensor_mkldnn(std::move(y), self.options());
+}
+
+Tensor& mkldnn_transpose_(Tensor& self, int64_t dim0, int64_t dim1) {
+  AT_ERROR("mkldnn_transpose_: in-place mkldnn operations are not supported yet");
+}
+
 } // namespace native
 } // namespace at

12 changes: 12 additions & 0 deletions aten/src/ATen/native/native_functions.yaml
@@ -1987,10 +1987,22 @@
   variants: function, method
   device_guard: False
 
+- func: mkldnn_transpose(Tensor self, int dim0, int dim1) -> Tensor
Contributor:
Why is this part of the public API? If you want to have helper functions, prepend an underscore to their name!

+  device_guard: False
+  requires_tensor: True
+  dispatch:
+    MkldnnCPU: mkldnn_transpose
+
 - func: transpose_(Tensor(a!) self, int dim0, int dim1) -> Tensor(a!)
   variants: method
   device_guard: False
 
+- func: mkldnn_transpose_(Tensor(a!) self, int dim0, int dim1) -> Tensor(a!)
+  device_guard: False
+  requires_tensor: True
+  dispatch:
+    MkldnnCPU: mkldnn_transpose_
+
 - func: one_hot(Tensor self, int num_classes=-1) -> Tensor
   python_module: nn
   variants: function
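For the in-place entry, a short sketch of the current behavior (not part of the diff; assumes an MKL-DNN build): transpose_ on an MKL-DNN tensor dispatches to mkldnn_transpose_ via the MkldnnCPU key, which raises because in-place MKL-DNN operations are not supported yet.

```python
import torch

x = torch.randn(3, 4, dtype=torch.float32).to_mkldnn()
try:
    x.transpose_(0, 1)  # routed to mkldnn_transpose_
except RuntimeError as e:
    print(e)  # "mkldnn_transpose_: in-place mkldnn operations are not supported yet"
```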
16 changes: 16 additions & 0 deletions test/test_mkldnn.py
@@ -279,6 +279,13 @@ def test_reshape(self):
             x.reshape(size),
             x.to_mkldnn().reshape(size).to_dense(),
         )
+        # test that reshape on a plain format tensor shares memory with the original
+        y = x.to_mkldnn()
+        z = y.reshape(size).add_(y.reshape(size))
+        self.assertEqual(
+            y.reshape(size).to_dense(),
+            z.to_dense(),
+        )
 
     def test_clone(self):
         x = torch.randn(4, 5, dtype=torch.float32) * 10
@@ -294,6 +301,15 @@ def test_clone(self):
             z.to_dense(),
         )
 
+    def test_transpose(self):
+        x = torch.randn(3, 4, 5, dtype=torch.float32) * 10
+        for dim1 in range(x.ndim):
+            for dim2 in range(x.ndim):
+                self.assertEqual(
+                    x.transpose(dim1, dim2),
+                    x.to_mkldnn().transpose(dim1, dim2).to_dense(),
+                )
+
     def test_linear(self):
         in_features = torch.randint(3, 10, (1,)).item()
         out_features = torch.randint(3, 100, (1,)).item()