use _sparse_coo_tensor_unsafe in coalesce for speedup (#21214)

Aapo Kyrola · facebook-github-bot · commit 678dc44d4c15 · 2019-05-31T17:10:05.000-07:00
Summary: Investigated why sparse tensor coalesce was slow (issue #10757). Using nvprof and a simple benchmark, I determined that the bulk of the time was spent in ``kernelTransformReduceInnermostDimIndex``, which is called when a sparse tensor is constructed with sparse_coo_tensor and the constructor sanity-checks the minimum and maximum indices. This check is unnecessary here: coalescing only removes duplicate indices, so the minimum and maximum indices cannot change. On my benchmark with 1 million non-zeros, the runtime of coalesce dropped from 0.52s to 0.005 sec (reported as about a 10x speedup).

Pull Request resolved: #21214
Reviewed By: bddppq
Differential Revision: D15584338
Pulled By: akyrola
fbshipit-source-id: a08378baa018dbd0b45d7aba661fc9aefd3791e0
diff --git a/aten/src/ATen/native/sparse/cuda/SparseCUDATensor.cu b/aten/src/ATen/native/sparse/cuda/SparseCUDATensor.cu
@@ -143,8 +143,10 @@ SparseTensor coalesce_sparse_cuda(const SparseTensor& self) {
     }
   }
   ////////////////////////////////////////////////////////////
-
-  SparseTensor dst = ::at::native::sparse_coo_tensor(newIndices, newValues, self.sizes())._coalesced_(true);
+  // We can use unsafe sparse tensor constructor because the indices do not
+  // need to be revalidated as we do not add or change indices, just remove
+  // duplicates.
+  SparseTensor dst = ::at::native::_sparse_coo_tensor_unsafe(newIndices, newValues, self.sizes())._coalesced_(true);
 
   THCudaCheck(cudaGetLastError());
   return dst;
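
The reasoning in the new comment (coalescing removes duplicates but never introduces or alters an index, so bounds validation is redundant) can be illustrated with a small host-side sketch. This is not the ATen or CUDA implementation: `Coo1D`, `indices_in_bounds`, and `coalesce` are hypothetical names invented for this example, the tensor is restricted to one sparse dimension, and the bounds check only approximates the min/max scan described in the summary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical 1-D COO container for illustration; not an ATen type.
// Assumes indices.size() == values.size().
struct Coo1D {
  std::vector<int64_t> indices;
  std::vector<float> values;
  int64_t size;  // dense length; valid indices lie in [0, size)
};

// The kind of check a validating constructor performs: a full scan of the
// indices for their minimum and maximum.
bool indices_in_bounds(const Coo1D& t) {
  if (t.indices.empty()) return true;
  auto mm = std::minmax_element(t.indices.begin(), t.indices.end());
  return *mm.first >= 0 && *mm.second < t.size;
}

// Coalesce: sort entries by index, then sum the values that share an index.
// Every output index is copied from the input, so the minimum and maximum
// index are identical before and after.
Coo1D coalesce(const Coo1D& in) {
  assert(in.indices.size() == in.values.size());
  std::vector<std::pair<int64_t, float>> entries;
  entries.reserve(in.indices.size());
  for (std::size_t i = 0; i < in.indices.size(); ++i) {
    entries.emplace_back(in.indices[i], in.values[i]);
  }
  std::stable_sort(entries.begin(), entries.end(),
                   [](const std::pair<int64_t, float>& a,
                      const std::pair<int64_t, float>& b) {
                     return a.first < b.first;
                   });
  Coo1D out{{}, {}, in.size};
  for (const auto& e : entries) {
    if (!out.indices.empty() && out.indices.back() == e.first) {
      out.values.back() += e.second;  // duplicate index: accumulate
    } else {
      out.indices.push_back(e.first);  // first occurrence: copy index as-is
      out.values.push_back(e.second);
    }
  }
  return out;
}
```

If the input passed `indices_in_bounds` when it was constructed, the coalesced result passes as well, so repeating the scan on the output only adds cost. That is the argument for skipping validation at this call site.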
