slow addmm which comes from bug with CPU backend #5047

@MlWoo

I think the addmm implementation must have a bug.
Actually, I am confused by the matrix transpose. For a matrix of size m x k, why is it treated as transposed if stride[1] == 1 && LDC_COND(r_->size[1], r_->size[0], r_->stride[0])? A stride of 1 in the second dimension means that the matrix is not transposed. Am I misunderstanding that? Am I missing some established usage rule? Or does the code follow CblasColMajor?

However, one comment conflicts with the other, and the corresponding code is not right either.
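To make the stride question concrete, here is a minimal sketch, assuming the BLAS routine behind addmm reads its operands in column-major (Fortran) order. The helper names `at_strided`, `at_colmajor`, and `rowmajor_is_colmajor_transpose` are hypothetical and are not part of TH. Under that assumption, a buffer with stride[1] == 1 is row-major, and rereading it as column-major yields the transposed matrix, which would explain why such a matrix is flagged as transposed.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of a strided 2-D view: offset = i*stride0 + j*stride1. */
static double at_strided(const double *buf, size_t i, size_t j,
                         size_t stride0, size_t stride1)
{
  return buf[i * stride0 + j * stride1];
}

/* Element (i, j) of a column-major matrix with leading dimension ld.
   This is the layout assumed here for a Fortran-style gemm. */
static double at_colmajor(const double *buf, size_t i, size_t j, size_t ld)
{
  assert(i < ld); /* a row index must stay below the leading dimension */
  return buf[i + j * ld];
}

/* Returns 1 if the 2 x 3 row-major matrix A, reread as a 3 x 2
   column-major matrix with ld = 3, equals A transposed. */
static int rowmajor_is_colmajor_transpose(void)
{
  /* A = [[1, 2, 3],
          [4, 5, 6]], stride = {3, 1} */
  const double a[6] = {1, 2, 3, 4, 5, 6};
  size_t i, j;
  for (i = 0; i < 2; i++)
    for (j = 0; j < 3; j++)
      if (at_strided(a, i, j, 3, 1) != at_colmajor(a, j, i, 3))
        return 0;
  return 1;
}
```

If the backend instead used CblasRowMajor, the same buffer would count as not transposed, so the answer depends on which convention the wrapper follows.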

If I misunderstand the matrix transpose, the code at L2006~L2025 should read as follows:

  /* m1 */
  /* Need ldm1_ >= max(1, (transpose_m1 == 't' ? k : m)) */
  if(m1->stride[(transpose_r == 'n' ? 0 : 1)] == 1 &&
     m1->stride[(transpose_r == 'n' ? 1 : 0)] >= THMax(1, m))
  {
    transpose_m1 = 'n';
    m1_ = m1;
  }
  else if(m1->stride[(transpose_r == 'n' ? 1 : 0)] == 1 &&
          m1->stride[(transpose_r == 'n' ? 0 : 1)] >= THMax(1, k))
  {
    transpose_m1 = 't';
    m1_ = m1;
  }
  else
  {
    transpose_m1 = (transpose_r == 'n' ? 't' : 'n');
    m1_ = THTensor_(newContiguous)(m1);
    free_m1 = 1;
  }
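The three-way decision in the snippet above can be isolated as a small standalone function. This is only a sketch under stated assumptions: it covers the transpose_r == 'n' case, assumes a column-major gemm, and uses the hypothetical names `choose_trans` and `max_long`, which do not exist in TH. It returns 'n' when the matrix can be passed as-is, 't' when it can be passed with the transpose flag, and 'c' when a contiguous copy would be needed.

```c
#include <assert.h>

static long max_long(long a, long b) { return a > b ? a : b; }

/* Decide how a column-major gemm could consume a rows x cols matrix
   with the given element strides (hypothetical helper, not TH API).
   'n': unit stride down a column, leading dimension stride1 >= rows.
   't': unit stride along a row, leading dimension stride0 >= cols.
   'c': neither layout fits, so a contiguous copy is required. */
static char choose_trans(long rows, long cols, long stride0, long stride1)
{
  assert(rows >= 0 && cols >= 0);
  if (stride0 == 1 && stride1 >= max_long(1, rows))
    return 'n';
  if (stride1 == 1 && stride0 >= max_long(1, cols))
    return 't';
  return 'c';
}
```

For a 2 x 3 matrix, strides {1, 2} give 'n', strides {3, 1} give 't', and strides {6, 2} give 'c', which is the case where the quoted code falls back to THTensor_(newContiguous).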
