Commit 678dc44

Aapo Kyrola authored and facebook-github-bot committed
use _sparse_coo_tensor_unsafe in coalesce for speedup (#21214)
Summary:
Studied why sparse tensor coalesce was slow (issue #10757). Using nvprof and a simple benchmark, I determined that the bulk of the time was spent in ``kernelTransformReduceInnermostDimIndex``, which is called when a sparse tensor is constructed with sparse_coo_tensor and the constructor sanity-checks the minimum and maximum indices. We do not need this sanity check here, because coalescing does not change these min/max values. On my benchmark with 1 million non-zeros, the runtime of coalesce improved by about 10x, from 0.52 s to 0.005 s.

Pull Request resolved: #21214
Reviewed By: bddppq
Differential Revision: D15584338
Pulled By: akyrola
fbshipit-source-id: a08378baa018dbd0b45d7aba661fc9aefd3791e0
1 parent 9e5f1db commit 678dc44

1 file changed, +4 -2 lines changed

aten/src/ATen/native/sparse/cuda/SparseCUDATensor.cu

Lines changed: 4 additions & 2 deletions
@@ -143,8 +143,10 @@ SparseTensor coalesce_sparse_cuda(const SparseTensor& self) {
 }
 }
 ////////////////////////////////////////////////////////////
-
-SparseTensor dst = ::at::native::sparse_coo_tensor(newIndices, newValues, self.sizes())._coalesced_(true);
+// We can use unsafe sparse tensor constructor because the indices do not
+// need to be revalidated as we do not add or change indices, just remove
+// duplicates.
+SparseTensor dst = ::at::native::_sparse_coo_tensor_unsafe(newIndices, newValues, self.sizes())._coalesced_(true);
 
 THCudaCheck(cudaGetLastError());
 return dst;
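
The reasoning behind this change can be illustrated with a small CPU-only sketch. It is not the ATen implementation: `ToyCoo`, `indices_in_bounds`, and `coalesce` are hypothetical names, and the tensor is reduced to one dimension for brevity. `indices_in_bounds` plays the role of the min/max sanity check done by the checked constructor, and `coalesce` only merges duplicate indices, so an input that already passed the check should still pass it afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a 1-D COO sparse tensor (not the ATen type).
struct ToyCoo {
  std::vector<int64_t> indices;  // one index per stored entry; duplicates allowed
  std::vector<double> values;    // values[i] belongs to indices[i]
  int64_t size;                  // length of the single dense dimension
};

// Stand-in for the checked constructor's sanity check: a full extra pass
// over the indices to confirm that min >= 0 and max < size.
bool indices_in_bounds(const ToyCoo& t) {
  if (t.indices.empty()) return true;
  auto mm = std::minmax_element(t.indices.begin(), t.indices.end());
  return *mm.first >= 0 && *mm.second < t.size;
}

// Coalesce: sort entries by index and sum the values of duplicates.
// No index is added or altered, so the smallest and largest index of the
// result are the same as those of the input.
ToyCoo coalesce(const ToyCoo& t) {
  std::map<int64_t, double> merged;  // ordered map keeps indices sorted
  for (std::size_t i = 0; i < t.indices.size(); ++i) {
    merged[t.indices[i]] += t.values[i];
  }
  ToyCoo out;
  out.size = t.size;
  for (const auto& kv : merged) {
    out.indices.push_back(kv.first);
    out.values.push_back(kv.second);
  }
  return out;
}
```

In this sketch, calling `indices_in_bounds` on the output of `coalesce` repeats work that was already done when the input was built, which is the redundancy the commit removes by switching to the unchecked constructor.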
