Commit 0a38a6a

Aidyn-A authored and pytorchmergebot committed
[ATen][CUDA][CUBLAS] cublasLtMatmul increase workspace_size (#120925)
According to the [cuBLAS API Reference](https://docs.nvidia.com/cuda/cublas/index.html#cublassetworkspace), the recommended workspace size is 32 MiB for Hopper and 4 MiB for all other architectures. This PR increases the workspace size accordingly. I am not aware of a recommended workspace size for HIP, so it is left unchanged.

Pull Request resolved: #120925
Approved by: https://github.com/eqy, https://github.com/malfet
1 parent 06b52dd commit 0a38a6a

1 file changed: +8 -1 lines changed

aten/src/ATen/cuda/CUDABlas.cpp

Lines changed: 8 additions & 1 deletion
@@ -183,13 +183,20 @@ uint32_t _getAlignment(uintptr_t address) {
 
 static size_t _parseChosenWorkspaceSize() {
   const char * val = getenv("CUBLASLT_WORKSPACE_SIZE");
+  size_t workspace_size = 1024;
 #ifdef USE_ROCM
   if (!val) {
     // accept either env var
     val = getenv("HIPBLASLT_WORKSPACE_SIZE");
   }
+#else
+  cudaDeviceProp* p = at::cuda::getDeviceProperties(c10::cuda::current_device());
+  if (p->major == 8) {
+    workspace_size = 4096;
+  } else if (p->major >= 9) {
+    workspace_size = 32768;
+  }
 #endif
-  size_t workspace_size = 1024; /* default size in KiB according to #73328 */
   if (val) {
     try {
       workspace_size = std::stoi(val);
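
The selection logic in this diff can be sketched as a standalone function with no CUDA dependency, which may make it easier to reason about. The sketch is illustrative only: `defaultWorkspaceKiB`, `chooseWorkspaceKiB`, and `workspaceBytes` are hypothetical names that do not appear in the patch, the compute-capability major version is passed in as a plain `int` instead of being read from `cudaDeviceProp`, and the fallback to the default on an unparsable value is an assumption, since the diff is cut off before the `catch` clause. Note that, as written in the diff, devices with a major version below 8 keep the 1024 KiB default.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: default workspace size in KiB, mirroring the
// thresholds in the diff (major == 8 -> 4 MiB, major >= 9 -> 32 MiB,
// anything older keeps the previous 1 MiB default).
std::size_t defaultWorkspaceKiB(int major) {
  if (major >= 9) {
    return 32768;
  }
  if (major == 8) {
    return 4096;
  }
  return 1024;
}

// Hypothetical helper: apply an optional override, such as the value of
// CUBLASLT_WORKSPACE_SIZE (pass nullptr when the variable is unset).
// Assumption: an unparsable value falls back to the per-device default.
std::size_t chooseWorkspaceKiB(int major, const char* env_val) {
  std::size_t size = defaultWorkspaceKiB(major);
  if (env_val != nullptr) {
    try {
      size = std::stoul(env_val);
    } catch (const std::exception&) {
      // Keep the default chosen above.
    }
  }
  return size;
}

// Hypothetical helper: convert the KiB value to bytes.
std::size_t workspaceBytes(std::size_t kib) {
  return kib * 1024;
}
```

For example, `chooseWorkspaceKiB(9, nullptr)` selects the 32 MiB default, while `chooseWorkspaceKiB(9, "2048")` lets the override take precedence over the per-device default, which matches the ordering in the diff where the environment variable is parsed after the default has been chosen.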