@sliwowitz
Contributor

In cp_fm_geeig, instead of reducing the generalized eigenvalue problem to standard form on the CPU and then solving the standard form on the GPU using cusolverMpSyevd, solve the whole generalized problem on the GPU using cusolverMpSygvd. This gives a nice speedup for all problems that are large enough to benefit from using cuSOLVERMp in the original code.

The original API for solving only the standard form on the GPU has been kept in the code, should there be a need to use it from some other routine.
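For readers unfamiliar with the reduction mentioned above, here is a minimal CPU-only sketch in NumPy of the "reduce to standard form, then solve" path for A x = λ B x, assuming A is symmetric and B is symmetric positive definite. It is not CP2K code and does not call cuSOLVERMp; the function name `geeig_via_standard_form` and the test matrices are hypothetical and serve only as illustration. A sygvd-style routine performs the same steps inside a single call.

```python
import numpy as np


def geeig_via_standard_form(a, b):
    """Solve A x = lambda B x by reduction to a standard eigenproblem.

    Assumes A is symmetric and B is symmetric positive definite.
    """
    # Step 1: Cholesky factorization B = L L^T (L lower triangular).
    l = np.linalg.cholesky(b)
    # Step 2: form C = L^{-1} A L^{-T}, which is symmetric again.
    tmp = np.linalg.solve(l, a)        # L^{-1} A
    c = np.linalg.solve(l, tmp.T).T    # (L^{-1} A) L^{-T}
    # Step 3: solve the standard problem C y = lambda y.
    w, y = np.linalg.eigh(c)
    # Step 4: back-transform the eigenvectors, x = L^{-T} y.
    x = np.linalg.solve(l.T, y)
    return w, x


# Small illustrative problem (random data, fixed seed for reproducibility).
rng = np.random.default_rng(0)
n = 4
m = rng.standard_normal((n, n))
a = (m + m.T) / 2.0                    # symmetric A
k = rng.standard_normal((n, n))
b = k @ k.T + n * np.eye(n)            # symmetric positive definite B

w, x = geeig_via_standard_form(a, b)

# Residual of A X = B X diag(w); should be near machine precision.
residual = np.linalg.norm(a @ x - (b @ x) * w)
print("eigenvalues:", w)
print("residual norm:", residual)
```

In the original code path, steps 1, 2, and 4 ran on the CPU and only step 3 was offloaded; the change described here moves the whole sequence to the GPU.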

@sliwowitz sliwowitz force-pushed the cusolvermp branch 3 times, most recently from c900392 to 73864ca Compare November 27, 2024 11:09
@sliwowitz
Contributor Author

The automatic checks passed, but I suppose none of the tests are actually testing a build + run with cuSOLVERMp enabled, right?

For the record, using the fixes in PR #3770, I'm building CP2K with the NVIDIA HPC SDK as follows:

cmake -DCMAKE_PREFIX_PATH="/home/vyskoc65/cp2k/lib/dbcsr;/opt/nvidia/hpc_sdk/Linux_x86_64/24.7/compilers/bin;/opt/nvidia/hpc_sdk/Linux_x86_64/24.7/math_libs;/opt/nvidia/hpc_sdk/Linux_x86_64/24.7/comm_libs/12.6/hpcx/latest/ucc;/opt/nvidia/hpc_sdk/Linux_x86_64/24.7/comm_libs/12.6/hpcx/latest/ucx" -DCP2K_USE_ACCEL="CUDA" -DCP2K_USE_CUSOLVER_MP=ON ..

@oschuett
Member

oschuett commented Nov 27, 2024

... none of the tests are actually testing a build + run with cuSOLVERMp enabled, right?

That's correct! There were a couple of reasons:

  • cuSOLVERMp is not included in the CUDA docker image that we're using and the manual install seemed brittle.
  • Some regtests failed with cuSOLVERMp.
  • We can only afford one CUDA dashboard test and I didn't want to lose coverage of ELPA's CUDA backend.

In all likelihood, things have improved over the last year. So maybe you want to give it another try?

@oschuett oschuett merged commit 1018eda into cp2k:master Dec 6, 2024
36 checks passed