[BUG] use_greedy in csrmm_nt is failing, cause performance issues (Proposed solution) #3012

@willyborn

Bug is described in opencl/kernel/csrmm.hpp (Line 35).
I stumbled on this code by accident while searching for other issues with threading.

Description

There is a logical error in the code.
Code snippet from opencl/kernel/csrmm.hpp (lines 73 to 80):

    std::vector<int> count(groups_x);
    cl::Buffer *counter = bufferAlloc(count.size() * sizeof(int));
    getQueue().enqueueWriteBuffer(
        *counter, CL_TRUE, 0, count.size() * sizeof(int), (void *)count.data());

    csrmm_nt_func(cl::EnqueueArgs(getQueue(), global, local), *out.data,
                  *values.data, *rowIdx.data, *colIdx.data, M, N, *rhs.data,
                  rhs.info, alpha, beta, *counter);

The counters are used in the OpenCL kernel (when USE_GREEDY is defined) to determine s_rowId: the counter belonging to each group_id(0) is incremented to obtain the next row. In the end, each group_id(0) work-item will have written to its rowId in increasing order (THREADS_PER_GROUP times).
The vector std::vector<int> count(groups_x); is not explicitly initialized, which I believe results in random values (depending on previous memory allocations) as the starting point for s_rowId.

I assume that each element of the vector has to be initialized to 0, so that each group starts counting rows from 0 in the OpenCL kernel.
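
To illustrate why the starting value of a counter matters, here is a minimal host-side sketch of the claiming scheme described above. It is not the actual csrmm_nt kernel: the function name claimed_rows and the loop structure are assumptions made for illustration, with std::atomic standing in for an OpenCL atomic increment.

```cpp
#include <atomic>
#include <vector>

// Hypothetical host-side model of one work-group claiming rows through a
// shared counter. This is an illustration, NOT the actual kernel code.
std::vector<int> claimed_rows(int counter_start, int M) {
    std::atomic<int> counter{counter_start};
    std::vector<int> rows;
    while (true) {
        // fetch_add returns the value held before the increment.
        int row = counter.fetch_add(1);
        if (row >= M) break;  // no rows left to claim
        rows.push_back(row);
    }
    return rows;
}
```

Under these assumptions, claimed_rows(0, 4) visits rows 0 to 3, while a counter that starts at 2 would skip rows 0 and 1, and one that starts at or above M would claim nothing.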

Proposed solution

    std::vector<int> count(groups_x, 0);
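
A small sketch for sanity-checking the proposal on the host side, outside of ArrayFire. The helper names (all_zero, make_counters_implicit, make_counters_explicit) are hypothetical. Note that the size-only constructor of std::vector value-initializes its elements, which for int should mean zero, so both forms are expected to yield zeroed counters. If that holds, the explicit 0 mostly documents intent.

```cpp
#include <algorithm>
#include <vector>

// Returns true when every counter in the vector is zero.
bool all_zero(const std::vector<int>& v) {
    return std::all_of(v.begin(), v.end(), [](int x) { return x == 0; });
}

// Size-only constructor: elements are value-initialized.
std::vector<int> make_counters_implicit(int groups_x) {
    return std::vector<int>(groups_x);
}

// Explicit fill value, as in the proposed solution.
std::vector<int> make_counters_explicit(int groups_x) {
    return std::vector<int>(groups_x, 0);
}
```

Calling all_zero on the result of either helper is a quick way to confirm what is actually written to the counter buffer before the kernel launch.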

System Information

ArrayFire 3.8.0 (master)

Checklist

  • Using the latest available ArrayFire release
  • GPU drivers are up to date
