test_svd_dense_opencl fails on Ubuntu 20.04 LTS using NVIDIA OpenCL on AWS g3s.xlarge instance #3147

@christopher-gee

Description

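Test 340 (test_svd_dense_opencl) fails when ArrayFire is built from source on Ubuntu 20.04 LTS with NVIDIA OpenCL on an AWS g3s.xlarge instance. The five failing cases are all in the af::af_cfloat group (svd/2), and each reports LAPACKE Error (-5) from src/backend/opencl/svd.cpp:149; the float, double, and af::af_cdouble groups pass.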

  • Did you build ArrayFire yourself or did you use the official installers?
    [ArrayFire was built from source code on AWS EC2 g3s instance (NVIDIA Tesla M60 GPU)]
  • Which backend is experiencing this issue? (CPU, CUDA, OpenCL)
    [OpenCL]
  • Do you have a workaround?
    [No]
  • Can the bug be reproduced reliably on your system?
    [Yes]
  • A clear and concise description of what you expected to happen.
    [Expect that Test 340 will pass successfully]
  • Run your executable with AF_TRACE=all and AF_PRINT_ERRORS=1 environment
    variables set.
    [Variables were set, but no additional output was produced]
  • Screenshot or terminal output of the results
ubuntu@ip-172-31-2-118:~/git/arrayfire/build$ make test ARGS="-V -R test_svd_dense_opencl"
Running tests...
UpdateCTestConfiguration  from :/home/ubuntu/git/arrayfire/build/DartConfiguration.tcl
Parse Config file:/home/ubuntu/git/arrayfire/build/DartConfiguration.tcl
 Add coverage exclude regular expressions.
 Add coverage exclude: test
 Add coverage exclude: extern/.*
 Add coverage exclude: test/mmio/.*
 Add coverage exclude: src/backend/cpu/threads/.*
 Add coverage exclude: src/backend/cuda/cub/.*
 Add coverage exclude: cl2.hpp
 Add coverage exclude: CMakeModules/.*
SetCTestConfiguration:CMakeCommand:/usr/bin/cmake
UpdateCTestConfiguration  from :/home/ubuntu/git/arrayfire/build/DartConfiguration.tcl
Parse Config file:/home/ubuntu/git/arrayfire/build/DartConfiguration.tcl
Test project /home/ubuntu/git/arrayfire/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 340
	Start 340: test_svd_dense_opencl

340: Test command: /home/ubuntu/git/arrayfire/build/test/svd_dense_opencl
340: Test timeout computed to be: 1500
340: [==========] Running 29 tests from 5 test cases.
340: [----------] Global test environment set-up.
340: [----------] 7 tests from svd/0, where TypeParam = float
340: [ RUN      ] svd/0.Square
340: [       OK ] svd/0.Square (782 ms)
340: [ RUN      ] svd/0.Rect0
340: [       OK ] svd/0.Rect0 (265 ms)
340: [ RUN      ] svd/0.Rect1
340: [       OK ] svd/0.Rect1 (274 ms)
340: [ RUN      ] svd/0.InPlaceSquare
340: [       OK ] svd/0.InPlaceSquare (581 ms)
340: [ RUN      ] svd/0.InPlaceRect0
340: [       OK ] svd/0.InPlaceRect0 (266 ms)
340: [ RUN      ] svd/0.InPlaceSameResultsSquare
340: [       OK ] svd/0.InPlaceSameResultsSquare (0 ms)
340: [ RUN      ] svd/0.InPlaceSameResultsRect0
340: [       OK ] svd/0.InPlaceSameResultsRect0 (1 ms)
340: [----------] 7 tests from svd/0 (2169 ms total)
340:
340: [----------] 7 tests from svd/1, where TypeParam = double
340: [ RUN      ] svd/1.Square
340: [       OK ] svd/1.Square (720 ms)
340: [ RUN      ] svd/1.Rect0
340: [       OK ] svd/1.Rect0 (330 ms)
340: [ RUN      ] svd/1.Rect1
340: [       OK ] svd/1.Rect1 (319 ms)
340: [ RUN      ] svd/1.InPlaceSquare
340: [       OK ] svd/1.InPlaceSquare (712 ms)
340: [ RUN      ] svd/1.InPlaceRect0
340: [       OK ] svd/1.InPlaceRect0 (325 ms)
340: [ RUN      ] svd/1.InPlaceSameResultsSquare
340: [       OK ] svd/1.InPlaceSameResultsSquare (1 ms)
340: [ RUN      ] svd/1.InPlaceSameResultsRect0
340: [       OK ] svd/1.InPlaceSameResultsRect0 (1 ms)
340: [----------] 7 tests from svd/1 (2408 ms total)
340:
340: [----------] 7 tests from svd/2, where TypeParam = af::af_cfloat
340: [ RUN      ] svd/2.Square
340: unknown file: Failure
340: C++ exception with description "ArrayFire Exception (Internal error:998):
340: In function svd
340: In file src/backend/opencl/svd.cpp:149
340: LAPACKE Error (-5)
340:  0# 0x00007FC334FA2F2A in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  1# 0x00007FC334FA3694 in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  2# af_svd in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  3# af::svd(af::array&, af::array&, af::array&, af::array const&) in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  4# 0x0000564A27190295 in /home/ubuntu/git/arrayfire/build/test/svd_dense_opencl
340:  5# void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  6# testing::Test::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  7# testing::TestInfo::Run() in /home/ubuntu/git/arrayfire/build/exter" thrown in the test body.
340: [  FAILED  ] svd/2.Square, where TypeParam = af::af_cfloat (240 ms)
340: [ RUN      ] svd/2.Rect0
340: unknown file: Failure
340: C++ exception with description "ArrayFire Exception (Internal error:998):
340: In function svd
340: In file src/backend/opencl/svd.cpp:149
340: LAPACKE Error (-5)
340:  0# 0x00007FC334FA2F2A in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  1# 0x00007FC334FA3694 in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  2# af_svd in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  3# af::svd(af::array&, af::array&, af::array&, af::array const&) in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  4# 0x0000564A27190295 in /home/ubuntu/git/arrayfire/build/test/svd_dense_opencl
340:  5# void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  6# testing::Test::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  7# testing::TestInfo::Run() in /home/ubuntu/git/arrayfire/build/exter" thrown in the test body.
340: [  FAILED  ] svd/2.Rect0, where TypeParam = af::af_cfloat (114 ms)
340: [ RUN      ] svd/2.Rect1
340: unknown file: Failure
340: C++ exception with description "ArrayFire Exception (Internal error:998):
340: In function svd
340: In file src/backend/opencl/svd.cpp:149
340: LAPACKE Error (-5)
340:  0# 0x00007FC334FA2F2A in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  1# 0x00007FC334FA355D in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  2# af_svd in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  3# af::svd(af::array&, af::array&, af::array&, af::array const&) in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  4# 0x0000564A27190295 in /home/ubuntu/git/arrayfire/build/test/svd_dense_opencl
340:  5# void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  6# testing::Test::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  7# testing::TestInfo::Run() in /home/ubuntu/git/arrayfire/build/exter" thrown in the test body.
340: [  FAILED  ] svd/2.Rect1, where TypeParam = af::af_cfloat (119 ms)
340: [ RUN      ] svd/2.InPlaceSquare
340: unknown file: Failure
340: C++ exception with description "ArrayFire Exception (Internal error:998):
340: In function svd
340: In file src/backend/opencl/svd.cpp:149
340: LAPACKE Error (-5)
340:  0# 0x00007FC334FA2F2A in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  1# af_svd_inplace in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  2# af::svdInPlace(af::array&, af::array&, af::array&, af::array&) in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  3# 0x0000564A2718EB3A in /home/ubuntu/git/arrayfire/build/test/svd_dense_opencl
340:  4# void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  5# testing::Test::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  6# testing::TestInfo::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  7# testing::TestCase::Run() in /home/" thrown in the test body.
340: [  FAILED  ] svd/2.InPlaceSquare, where TypeParam = af::af_cfloat (231 ms)
340: [ RUN      ] svd/2.InPlaceRect0
340: unknown file: Failure
340: C++ exception with description "ArrayFire Exception (Internal error:998):
340: In function svd
340: In file src/backend/opencl/svd.cpp:149
340: LAPACKE Error (-5)
340:  0# 0x00007FC334FA2F2A in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  1# af_svd_inplace in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  2# af::svdInPlace(af::array&, af::array&, af::array&, af::array&) in /home/ubuntu/git/arrayfire/build/src/backend/opencl/libafopencl.so.3
340:  3# 0x0000564A2718EB3A in /home/ubuntu/git/arrayfire/build/test/svd_dense_opencl
340:  4# void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  5# testing::Test::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  6# testing::TestInfo::Run() in /home/ubuntu/git/arrayfire/build/extern/googletest-build/googlemock/gtest/libgtest.so
340:  7# testing::TestCase::Run() in /home/" thrown in the test body.
340: [  FAILED  ] svd/2.InPlaceRect0, where TypeParam = af::af_cfloat (115 ms)
340: [ RUN      ] svd/2.InPlaceSameResultsSquare
340: [       OK ] svd/2.InPlaceSameResultsSquare (1 ms)
340: [ RUN      ] svd/2.InPlaceSameResultsRect0
340: [       OK ] svd/2.InPlaceSameResultsRect0 (1 ms)
340: [----------] 7 tests from svd/2 (821 ms total)
340:
340: [----------] 7 tests from svd/3, where TypeParam = af::af_cdouble
340: [ RUN      ] svd/3.Square
340: [       OK ] svd/3.Square (1799 ms)
340: [ RUN      ] svd/3.Rect0
340: [       OK ] svd/3.Rect0 (737 ms)
340: [ RUN      ] svd/3.Rect1
340: [       OK ] svd/3.Rect1 (734 ms)
340: [ RUN      ] svd/3.InPlaceSquare
340: [       OK ] svd/3.InPlaceSquare (1696 ms)
340: [ RUN      ] svd/3.InPlaceRect0
340: [       OK ] svd/3.InPlaceRect0 (698 ms)
340: [ RUN      ] svd/3.InPlaceSameResultsSquare
340: [       OK ] svd/3.InPlaceSameResultsSquare (0 ms)
340: [ RUN      ] svd/3.InPlaceSameResultsRect0
340: [       OK ] svd/3.InPlaceSameResultsRect0 (0 ms)
340: [----------] 7 tests from svd/3 (5665 ms total)
340:
340: [----------] 1 test from svd
340: [ RUN      ] svd.InPlaceRect0_Exception
340: [       OK ] svd.InPlaceRect0_Exception (1 ms)
340: [----------] 1 test from svd (1 ms total)
340:
340: [----------] Global test environment tear-down
340: [==========] 29 tests from 5 test cases ran. (11064 ms total)
340: [  PASSED  ] 24 tests.
340: [  FAILED  ] 5 tests, listed below:
340: [  FAILED  ] svd/2.Square, where TypeParam = af::af_cfloat
340: [  FAILED  ] svd/2.Rect0, where TypeParam = af::af_cfloat
340: [  FAILED  ] svd/2.Rect1, where TypeParam = af::af_cfloat
340: [  FAILED  ] svd/2.InPlaceSquare, where TypeParam = af::af_cfloat
340: [  FAILED  ] svd/2.InPlaceRect0, where TypeParam = af::af_cfloat
340:
340:  5 FAILED TESTS
1/1 Test #340: test_svd_dense_opencl ............***Failed   11.12 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =  11.13 sec

The following tests FAILED:
		340 - test_svd_dense_opencl (Failed)
Run command: ./test/print_info
ArrayFire v3.9.0 (OpenCL, 64-bit Linux, build 9738a316)
[0] NVIDIA: NVIDIA Tesla M60, 7618 MB
Errors while running CTest
make: *** [Makefile:97: test] Error 8
ubuntu@ip-172-31-2-118:~/git/arrayfire/build$
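
To narrow the reproduction, the failing group can probably be run straight from the test binary instead of through CTest. The sketch below assumes the build directory and binary path shown in the log above. `--gtest_filter` is a standard GoogleTest option, and `run_cfloat_svd_tests` is a hypothetical helper name, not part of ArrayFire.

```shell
# Hypothetical helper: run only the af_cfloat SVD cases (svd/2.*) from the
# test binary named in the CTest log, with ArrayFire tracing enabled.
run_cfloat_svd_tests() {
  # Build directory is an assumption taken from the log; pass another as $1.
  local build_dir="${1:-$HOME/git/arrayfire/build}"
  local test_bin="$build_dir/test/svd_dense_opencl"
  if [ -x "$test_bin" ]; then
    AF_TRACE=all AF_PRINT_ERRORS=1 "$test_bin" --gtest_filter='svd/2.*'
  else
    echo "svd_dense_opencl not found in $build_dir/test"
    return 1
  fi
}
```

Calling `run_cfloat_svd_tests ~/git/arrayfire/build` should then execute only the seven svd/2 cases, which keeps the output short when comparing against another LAPACKE or OpenCL setup.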

Reproducible Code and/or Steps

  • Used Ubuntu 20.04 LTS AMI on AWS EC2 g3s.xlarge instance (NVIDIA Tesla M60 GPU)
  • Followed the steps for the Build Process on Linux and installed NVIDIA CUDA 11.3 as the OpenCL SDK to satisfy the OpenCL backend dependency.
  • Installed the following packages
sudo apt update
sudo apt upgrade
sudo apt-get install -y build-essential git cmake libfreeimage-dev
sudo apt-get install -y cmake-curses-gui
sudo apt-get install -y clinfo
sudo apt-get install -y opencl-headers
sudo apt-get install -y libboost-all-dev
sudo apt-get install -y ocl-icd-opencl-dev
sudo apt-get install -y libglfw3-dev
sudo apt-get install -y libopenblas-dev libfftw3-dev liblapacke-dev
sudo apt-get install -y libmkl-dev
  • Installed NVIDIA CUDA 11.3 per the following instructions
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub 
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda
  • CMake Terminal Output
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found suitable version "11.3", minimum required is "9.0")
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Could NOT find cuDNN (missing: cuDNN_LINK_LIBRARY cuDNN_INCLUDE_DIRS) (Required is at least version "4.0")
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - found
-- Found OpenCL: /usr/lib/x86_64-linux-gnu/libOpenCL.so (found suitable version "2.0", minimum required is "1.2")
-- Found OpenGL: /usr/lib/x86_64-linux-gnu/libGL.so
-- Found FreeImage: /usr/include
-- Checking for module 'fftw3'
--   Found fftw3, version 3.3.8
-- Found FFTW: /usr/include
-- Checking for module 'cblas'
--   No package 'cblas' found
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of void*
-- Check size of void* - done
-- Checking for [Accelerate]
-- Checking for [vecLib]
-- Checking for [cblas - atlas]
-- Includes found
-- Checking for [openblas]
-- Includes found
-- Looking for cblas_dgemm
-- Looking for cblas_dgemm - found
-- CBLAS Symbols FOUND
-- CBLAS library found
-- Found LAPACKE: /usr/lib/x86_64-linux-gnu/liblapacke.so
-- Found LAPACK: /usr/include
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- Check size of int
-- Check size of int - done
-- MKL: Thread Layer(Intel OpenMP) Interface(4-byte Integer)
-- Could NOT find MKL: Source the compilervars.sh or mklvars.sh scripts included with your installation of MKL. This script searches for the libraries in MKLROOT, LIBRARY_PATHS(Linux), and LIB(Windows) environment variables (missing: MKL_INCLUDE_DIR)
-- Could NOT find MKL: Source the compilervars.sh or mklvars.sh scripts included with your installation of MKL. This script searches for the libraries in MKLROOT, LIBRARY_PATHS(Linux), and LIB(Windows) environment variables (missing: MKL_INCLUDE_DIR)
-- Found Boost: /usr/lib/x86_64-linux-gnu/cmake/Boost-1.71.0/BoostConfig.cmake (found suitable version "1.71.0", minimum required is "1.66")
-- Performing Test has_ignored_attributes_flag
-- Performing Test has_ignored_attributes_flag - Success
-- Performing Test has_all_warnings_flag
-- Performing Test has_all_warnings_flag - Success
-- Autodetected CUDA architecture(s):  5.2
-- CUDA_architecture_build_targets: Auto ( 5.2  )
-- Performing Test group_flags
-- Performing Test group_flags - Success
-- Found OpenCL: /usr/lib/x86_64-linux-gnu/libOpenCL.so (found version "2.0")
-- UNICODE feature disabled on linux
-- 64bit build - FIND_LIBRARY_USE_LIB64_PATHS TRUE
-- Found Boost: /usr/lib/x86_64-linux-gnu/cmake/Boost-1.71.0/BoostConfig.cmake (found suitable version "1.71.0", minimum required is "1.33.0") found components: program_options
-- Boost_PROGRAM_OPTIONS_LIBRARY: Boost::program_options
-- Found OpenCL: /usr/local/cuda/lib64/libOpenCL.so
-- Found FFTW: /usr/lib/x86_64-linux-gnu/libfftw3f.so;/usr/lib/x86_64-linux-gnu/libfftw3.so
-- Detected GNU fortran compiler.
-- CMAKE_CXX_COMPILER flags: -m64 -pthread
-- CMAKE_CXX_COMPILER debug flags: -g
-- CMAKE_CXX_COMPILER release flags: -O3 -DNDEBUG
-- CMAKE_CXX_COMPILER relwithdebinfo flags: -O2 -g -DNDEBUG
-- CMAKE_EXE_LINKER link flags:
FFT clients will NOT be built
GoogleTest unit tests will NOT be built
FFT callback client will NOT be built
-- Found OpenCL: /usr/lib/x86_64-linux-gnu/libOpenCL.so (found version "2.0")
-- Found PythonInterp: /usr/bin/python3.8 (found version "3.8.5")
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Performing Test pthread_flag
-- Performing Test pthread_flag - Success
-- Configuring done
CMake Warning at src/backend/opencl/CMakeLists.txt:42 (add_library):
  Cannot generate a safe runtime search path for target afopencl because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libOpenCL.so.1] in /usr/lib/x86_64-linux-gnu may be hidden by files in:
      /usr/local/cuda/lib64

  Some of these libraries may not be found correctly.


-- Generating done
-- Build files have been written to: /home/ubuntu/git/arrayfire/build
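
Because the configure step reports a possible libOpenCL.so.1 conflict and finds LAPACKE but not MKL, it may help to check which shared libraries the OpenCL backend actually resolves at run time. This is only a diagnostic sketch: the library path is taken from the stack trace above, `show_af_opencl_deps` is a hypothetical helper name, and the output is not known for this system.

```shell
# Hypothetical helper: list the OpenCL, LAPACK(E), BLAS and MKL libraries
# that the dynamic linker resolves for the ArrayFire OpenCL backend.
show_af_opencl_deps() {
  # Default path is an assumption taken from the stack trace; pass another as $1.
  local lib="${1:-$HOME/git/arrayfire/build/src/backend/opencl/libafopencl.so.3}"
  if [ -e "$lib" ]; then
    ldd "$lib" | grep -Ei 'opencl|lapack|blas|mkl'
  else
    echo "library not found: $lib"
    return 1
  fi
}
```

Running `show_af_opencl_deps` with no arguments uses the path from the log; the resolved paths can then be compared with the ones CMake printed.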

System Information

  1. ArrayFire version
    [Master Branch, commit 9738a31]
  2. Devices installed on the system
    [NVIDIA Tesla M60 GPU]
  3. Output from the following scripts:
    Linux:
ubuntu@ip-172-31-2-118:~/git/arrayfire/build$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:        20.04
Codename:       focal

ubuntu@ip-172-31-2-118:~/git/arrayfire/build$ if command -v nvidia-smi >/dev/null; then
>   nvidia-smi --query-gpu="name,memory.total,driver_version" --format=csv -i 0
> else
>   echo "nvidia-smi not found"
> fi
name, memory.total [MiB], driver_version
NVIDIA Tesla M60, 7618 MiB, 465.19.01

ubuntu@ip-172-31-2-118:~/git/arrayfire/build$ if command -v /opt/rocm/bin/rocm-smi >/dev/null; then
>   /opt/rocm/bin/rocm-smi --showproductname
> else
>   echo "rocm-smi not found."
> fi
rocm-smi not found.

ubuntu@ip-172-31-2-118:~/git/arrayfire/build$ if command -v clinfo > /dev/null; then
>   clinfo
> else
>   echo "clinfo not found."
> fi
Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 3.0 CUDA 11.3.55
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_khr_gl_event cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_kernel_attribute cl_khr_device_uuid
  Platform Host timer resolution                  0ns
  Platform Extensions function suffix             NV

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     NVIDIA Tesla M60
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 3.0 CUDA
  Driver Version                                  465.19.01
  Device OpenCL C Version                         OpenCL C 1.2
  Device Type                                     GPU
  Device Topology (NV)                            PCI-E, 00:03.6
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               16
  Max clock frequency                             1177MHz
  Compute Capability (NV)                         5.2
  Device Partition                                (core)
    Max number of sub-devices                     1
    Supported partition types                     None
    Supported affinity domains                    (n/a)
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Max sub-groups per work group                   0
  Preferred / native vector sizes
    char                                                 1 / 1
    short                                                1 / 1
    int                                                  1 / 1
    long                                                 1 / 1
    half                                                 0 / 0        (n/a)
    float                                                1 / 1
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  Yes
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
  Address bits                                    64, Little-Endian
  Global memory size                              7988903936 (7.44GiB)
  Error Correction support                        Yes
  Max memory allocation                           1997225984 (1.86GiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Shared Virtual Memory (SVM) capabilities        (core)
    Coarse-grained buffer sharing                 Yes
    Fine-grained buffer sharing                   No
    Fine-grained system sharing                   No
    Atomics                                       No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Preferred alignment for atomics
    SVM                                           0 bytes
    Global                                        0 bytes
    Local                                         0 bytes
  Max size for global variable                    0
  Preferred total size of global vars             0
  Global Memory cache type                        Read/Write
  Global Memory cache size                        786432 (768KiB)
  Global Memory cache line size                   128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             32
    Max size for 1D images from buffer            268435456 pixels
    Max 1D or 2D image array size                 2048 images
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             4096x4096x4096 pixels
    Max number of read image args                 256
    Max number of write image args                16
    Max number of read/write image args           0
  Max number of pipe args                         0
  Max active pipe reservations                    0
  Max pipe packet size                            0
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        65536
  Max number of constant args                     9
  Max constant buffer size                        65536 (64KiB)
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties (on host)
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Queue properties (on device)
    Out-of-order execution                        No
    Profiling                                     No
    Preferred size                                0
    Max size                                      0
  Max queues on device                            0
  Max events on device                            0
  Prefer user sync for interop                    No
  Profiling timer resolution                      1000ns
  Execution capabilities
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Sub-group independent forward progress        No
    Kernel execution timeout (NV)                 No
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  2
    IL version                                    (n/a)
  printf() buffer size                            1048576 (1024KiB)
  Built-in kernels                                (n/a)
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_icd cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_khr_gl_event cl_nv_create_buffer cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_nv_kernel_attribute cl_khr_device_uuid

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  Invalid device type for platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform
        NOTE:   your OpenCL library only supports OpenCL 2.2,
                but some installed platforms support OpenCL 3.0.
                Programs using 3.0 features may crash
                or behave unexpectedly

Checklist

  • [NO] Using the latest available ArrayFire release
  • [YES] GPU drivers are up to date
