
Conversation

Collaborator

@ssnl ssnl commented Feb 23, 2018

See the commit messages for details of each commit. Note that the last commit changes the QR-based approach to an LU-based one.

Some comments about switching from QR to LU:

  1. QR (geqrf) is generally more stable than LU (getrf).
  2. But getrf gives specific information about whether the matrix is singular. Furthermore, the backward (when det > 0) uses the matrix inverse (getrf + getri), so using getrf in the forward makes sense.
  3. Finally, this is also more consistent with many other libraries (including NumPy and MATLAB), which use LU factorization (getrf) to compute the determinant.

Therefore, in this PR, det is computed using btrifact (batched getrf); a sketch of this approach follows the list below. However, the backward still uses the inverse (getrf + getri). Future work therefore includes:

  1. expose getri and cache getrf results to speed up backward.
  2. implement batched getri to support batched det methods.

Collaborator Author

ssnl commented Feb 24, 2018

@pytorchbot retest this please

@ssnl ssnl force-pushed the log_det branch 5 times, most recently from b8eb01e to 4395b8e on February 24, 2018 at 20:39
@ssnl ssnl changed the title from Add logdet and slogdet to [wip] Add logdet and slogdet on Feb 26, 2018
@ssnl ssnl force-pushed the log_det branch 2 times, most recently from f3b2d0b to f358631 on February 26, 2018 at 22:58
@ssnl ssnl changed the title from [wip] Add logdet and slogdet to [ready] Add logdet and slogdet on Feb 26, 2018
@ssnl ssnl force-pushed the log_det branch 3 times, most recently from b696396 to 2c53660 on February 26, 2018 at 23:57
Contributor

vishwakftw commented Feb 27, 2018

@ssnl I will have to use logdet for the implementation of the KL divergence between two multivariate normal distributions. I would like to know how fast this would be compared to computing the determinant (using potrf and prod) and taking the log of it.

In fact, my question is motivated by this term:
[image: the log-determinant ratio term, log(det Σ₁ / det Σ₀), from the multivariate normal KL divergence]

Would I be better off:

  1. computing the determinants of both covariance matrices, taking the absolute value of the ratio, then taking the log, or
  2. computing slogdet of both covariance matrices and taking the difference?

Thanks.

Contributor

soumith commented Feb 27, 2018

Putting speed aside, the gradient formulas permitting this would probably be more numerically stable.

Collaborator Author

ssnl commented Feb 27, 2018

Re @vishwakftw: @soumith is right. slogdet will be more numerically stable, not only in backward but in forward as well. QR is only roughly twice as expensive as LU factorization, but LU performs much worse on some ill-conditioned matrices.
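
To make the stability point concrete, a toy example (mine, using NumPy's matching API rather than this PR's functions): the determinant itself under- or overflows long before its log does.

```python
import numpy as np

a = 0.01 * np.eye(500)  # perfectly conditioned, but det(a) = 1e-1000

print(np.linalg.det(a))                  # 0.0 -- underflows in float64
print(np.log(np.abs(np.linalg.det(a))))  # -inf

sign, logabsdet = np.linalg.slogdet(a)
print(sign, logabsdet)                   # 1.0 -2302.585..., computed stably
```

Option 2 only ever subtracts two such log-magnitudes, so it never materializes the possibly huge or tiny determinant ratio.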

@ssnl ssnl force-pushed the log_det branch 7 times, most recently from 59b2ad9 to 43d60a8 on February 27, 2018 at 22:42
Collaborator Author

ssnl commented Feb 28, 2018

@pytorchbot retest this please

@ssnl ssnl force-pushed the log_det branch 4 times, most recently from 4ecd2fe to 338a869 on March 1, 2018 at 20:08
@ssnl ssnl force-pushed the log_det branch 3 times, most recently from a65f6f5 to 8ef991f on March 6, 2018 at 02:35


Collaborator Author

ssnl commented Mar 7, 2018

Hold off on reviewing, @zou3519. I plan to move the det computation from QR to LU factorization. It would be more consistent with libraries like NumPy and MATLAB, and also won't have problems with the inverse in the backward.

Collaborator Author

ssnl commented Mar 8, 2018

This comment was moved to the top.

@ssnl ssnl force-pushed the log_det branch 7 times, most recently from 4759907 to aba183d on March 14, 2018 at 23:05
Collaborator Author

ssnl commented Mar 16, 2018

@vishwakftw @samuela Do one of you want to review this PR? Thanks in advance! :)

Contributor

@vishwakftw vishwakftw left a comment


I think the code looks good. Thanks!!

ssnl added 8 commits March 15, 2018 23:41
2. Previously, det could return a result with an incorrect sign on symmetric
   matrices. This was caused by a wrong assumption I had about SVD (that
   U = V^T when the input is symmetric). This fixes it.
3. Moreover, after fixing 2, QR is now always needed for the det forward, so I
   moved SVD to the backward call. Since this is a specific variant of SVD, it
   is named _svd_with_positive_UV_det, with its derivatives.yaml entry being
   svd_backward.
4. Updated/added backward functions for det, logdet and slogdet, which use
   _svd_with_positive_UV_det and svd_backward inside.
5. Optimized svd_backward:
   a. Avoid unnecessary kernels when only sigma has a gradient (this is the
      usual case, and is also true for the *det backward functions).
   b. Fix SVD double backward by avoiding a NaN.
2. Fix an incorrect check for dim_args_idx in test_autograd.py
3. Add option to only test a subset of output values, specified by
   test_output_indices, for cases like slogdet where only the
   second output is differentiable.
4. Add better docs for the test-generating list.
Add a scaling to random matrices so closeness checks are more robust
Use SVD only for the non-invertible case, so the special variant is no longer needed.
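
As a sanity check on the backward formulas above, here is a small finite-difference sketch (mine, not part of this PR's test suite) of the identity the logdet backward relies on for invertible inputs, d log|det A| / dA = inv(A)^T:

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.randn(4, 4) + 4.0 * np.eye(4)  # keep it safely invertible
eps = 1e-6

analytic = np.linalg.inv(a).T  # d log|det A| / dA = inv(A)^T

numeric = np.zeros_like(a)
for i in range(4):
    for j in range(4):
        da = np.zeros_like(a)
        da[i, j] = eps
        # central difference on log|det|, via slogdet's second output
        numeric[i, j] = (np.linalg.slogdet(a + da)[1]
                         - np.linalg.slogdet(a - da)[1]) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-5)
```

The SVD-based path is only needed at the non-invertible points where this formula breaks down.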
@soumith soumith merged commit 940a0ab into pytorch:master Mar 16, 2018
@ssnl
Copy link
Collaborator Author

ssnl commented Mar 16, 2018

@vishwakftw @soumith Thanks!!!

@ssnl ssnl deleted the log_det branch March 16, 2018 13:40
@ssnl ssnl mentioned this pull request Mar 16, 2018