I am using OpenCL 1.1 and want to run my code on all of my GPUs and all of my CPUs. Because synchronizing across two different contexts is hard, I wanted to create a single context that contains all CPUs and GPUs as devices. First I get all the platforms, then the devices of each platform, and store the CPU and GPU devices in separate vectors. To create the context, I build one vector containing all the CPU and GPU devices and call clCreateContext with it. That seems to work fine, but afterwards, when I create a command queue for each device separately, I always get:

OpenCL call falls with error -34.

The code is as follows:

  cl_int error = CL_SUCCESS;
  cl_uint num_platforms;
  clGetPlatformIDs(0, nullptr, &num_platforms);
  if (num_platforms == 0){
    std::cout << "Cannot find any platform.\n";
    return;
  }
  platform.resize(num_platforms);
  error = clGetPlatformIDs(num_platforms, platform.data(), nullptr);
  checkError(error);

  for (cl_uint i = 0; i < num_platforms; i++){
    std::string platform_name;
    size_t platform_name_len;
    clGetPlatformInfo(platform[i], CL_PLATFORM_NAME, 0, nullptr, &platform_name_len);
    platform_name.resize(platform_name_len);
    clGetPlatformInfo(platform[i], CL_PLATFORM_NAME, platform_name_len, const_cast<char*>(platform_name.data()), nullptr);
    std::cout << "[" << i << "]\t" << platform_name << std::endl;

    std::vector<cl_device_id> devices(0);
    cl_uint num_cpus = 0, num_gpus = 0;
    error = clGetDeviceIDs(platform[i], CL_DEVICE_TYPE_CPU, 0, nullptr, &num_cpus);
    error = clGetDeviceIDs(platform[i], CL_DEVICE_TYPE_GPU, 0, nullptr, &num_gpus);
    devices.resize(num_cpus);

    std::cout << "\tCPUS: \n";
    error = clGetDeviceIDs(platform[i], CL_DEVICE_TYPE_CPU, num_cpus, devices.data(), nullptr);
    for (cl_uint d = 0; d < num_cpus; d++){
      std::string device_name;
      size_t device_name_len;
      clGetDeviceInfo(devices[d], CL_DEVICE_NAME, 0, nullptr, &device_name_len);
      device_name.resize(device_name_len);
      clGetDeviceInfo(devices[d], CL_DEVICE_NAME, device_name_len, const_cast<char*>(device_name.data()), nullptr);
      std::cout << "\t\t[" << d << "]\t" << device_name << std::endl;

      cpu_devices.push_back(devices[d]);
    }

    std::cout << "\tGPUS: \n";
    devices.resize(num_gpus);
    error = clGetDeviceIDs(platform[i], CL_DEVICE_TYPE_GPU, num_gpus, devices.data(), nullptr);
    for (cl_uint d = 0; d < num_gpus; d++){
      std::string device_name;
      size_t device_name_len;
      clGetDeviceInfo(devices[d], CL_DEVICE_NAME, 0, nullptr, &device_name_len);
      device_name.resize(device_name_len);
      clGetDeviceInfo(devices[d], CL_DEVICE_NAME, device_name_len, const_cast<char*>(device_name.data()), nullptr);
      std::cout << "\t\t[" << d << "]\t" << device_name << std::endl;

      gpu_devices.push_back(devices[d]);
    }
  }

  std::vector<cl_device_id> devices;
  for (size_t i = 0; i < cpu_devices.size(); i++)
    devices.push_back(cpu_devices[i]);
  for (size_t i = 0; i < gpu_devices.size(); i++)
    devices.push_back(gpu_devices[i]);

  ctx = clCreateContext(NULL, static_cast<cl_uint>(devices.size()), devices.data(), nullptr, nullptr, nullptr);

  cpu_devices_queue.resize(cpu_devices.size());
  for (size_t i = 0; i < cpu_devices.size(); i++){
    cpu_devices_queue[i] = clCreateCommandQueue(ctx, cpu_devices[i], 0, &error);
    checkError(error);
  }

  gpu_devices_queue.resize(gpu_devices.size());
  for (size_t i = 0; i < gpu_devices.size(); i++){
    gpu_devices_queue[i] = clCreateCommandQueue(ctx, gpu_devices[i], 0, &error);
    checkError(error);
  }

1 Answer

An OpenCL context can only encapsulate devices from a single platform, and cannot be created using devices from two or more different platforms.
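One way to honour that restriction is to keep the devices grouped by the platform they were enumerated from, and to create one context per group. The sketch below shows only that bookkeeping. `PlatformId`, `DeviceId`, `DeviceEntry` and `groupByPlatform` are hypothetical names, and the integer aliases stand in for `cl_platform_id` and `cl_device_id` so the snippet builds without the OpenCL headers.

```cpp
#include <map>
#include <vector>

// Stand-ins for cl_platform_id / cl_device_id (assumption: chosen only so
// this sketch compiles without CL/cl.h; real code would use the OpenCL types).
using PlatformId = int;
using DeviceId = int;

// Hypothetical record: a device together with the platform it came from.
struct DeviceEntry {
  PlatformId platform;
  DeviceId device;
};

// Groups devices by platform. Each bucket holds devices from exactly one
// platform, so it can be handed to its own clCreateContext call.
std::map<PlatformId, std::vector<DeviceId>> groupByPlatform(
    const std::vector<DeviceEntry>& entries) {
  std::map<PlatformId, std::vector<DeviceId>> buckets;
  for (const DeviceEntry& e : entries) {
    buckets[e.platform].push_back(e.device);
  }
  return buckets;
}
```

In the question's code, the same effect could be had by creating the context inside the per-platform loop from that platform's devices, rather than merging every device into one vector first.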

You are not actually checking whether your call to clCreateContext succeeds. If you checked the return value or the error code, you would likely see that it was in fact failing. This is why when you later use that context in your call to clCreateCommandQueue, you receive error -34 (CL_INVALID_CONTEXT).
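As a rough illustration, a small helper that prints the symbolic name of a status code makes such failures easier to spot. This is only a sketch: `clErrorName` is a hypothetical function, not part of OpenCL, and the numeric values are copied from the OpenCL headers as literals so that the snippet compiles on its own.

```cpp
#include <string>

// Hypothetical helper (not an OpenCL API): maps a few common OpenCL status
// codes to their symbolic names. The numbers match the definitions in CL/cl.h.
std::string clErrorName(int code) {
  switch (code) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -30: return "CL_INVALID_VALUE";
    case -32: return "CL_INVALID_PLATFORM";
    case -33: return "CL_INVALID_DEVICE";
    case -34: return "CL_INVALID_CONTEXT";
    default:  return "unknown OpenCL error " + std::to_string(code);
  }
}
```

In the question's code, passing `&error` as the last argument of clCreateContext and calling `checkError(error)` immediately afterwards would report the failure at the call that caused it, instead of later in clCreateCommandQueue.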

6 Comments

So, it is not possible to use all the CPUs and GPUs of a computer in the same context, and I should create as many contexts as I have platforms? And then synchronize using clFlush()?
@mmostajab Correct, you need to create multiple context objects if you wish to use devices from multiple platforms within the same program, and you will need to manually synchronise via the host and manually copy memory between the devices.
Is that something we have to do only in OpenCL 1.1, or also in OpenCL 2.0? Because then having shared virtual memory between the CPU and GPU would not be possible.
You can certainly use SVM between the host (CPU) and a GPU device in OpenCL 2.0 - although this doesn't treat the CPU as an OpenCL device. If you have an AMD or Intel GPU, you can use the same OpenCL platform on the CPU and GPU.
So, if we use the CPU as the OpenCL device, it won't use SIMD instructions when running on the CPU? I thought it would be much faster than using just multicore or multi-threading solutions.