
Conversation

@anjali411
Contributor

@anjali411 anjali411 commented Jul 6, 2020

Stack from ghstack:

Differential Revision: D22476911

[ghstack-poisoned]
anjali411 added a commit that referenced this pull request Jul 6, 2020
ghstack-source-id: 850637d
Pull Request resolved: #41012
@anjali411 anjali411 requested a review from albanD July 6, 2020 16:19
@anjali411 anjali411 changed the title from "complex autograd doc" to "Autograd Doc for Complex Numbers" Jul 6, 2020
@dr-ci

dr-ci bot commented Jul 6, 2020

💊 CI failures summary and remediations

As of commit 2db4435 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

This comment has been revised 36 times.

anjali411 added a commit that referenced this pull request Jul 6, 2020
ghstack-source-id: 216fb84
Pull Request resolved: #41012
@anjali411 anjali411 requested a review from ezyang July 6, 2020 19:26
anjali411 added a commit that referenced this pull request Jul 7, 2020
ghstack-source-id: 23fe293
Pull Request resolved: #41012
anjali411 added a commit that referenced this pull request Jul 7, 2020
ghstack-source-id: 8d199c0
Pull Request resolved: #41012
What happens if I call backward() on a complex scalar?
******************************************************

1. For holomorphic functions, you get the same result as expected from using the Cauchy-Riemann equations.
Contributor

It's not clear what "same result" means here

Contributor

For holomorphic functions, the gradient can be fully represented with complex numbers due to the Cauchy-Riemann equations

PyTorch follows `JAX's <https://jax.readthedocs.io/en/latest/notebooks/autodiff_cookbook.html#Complex-numbers-and-differentiation>`_
convention for autograd for complex numbers.

For a function :math:`F: C → C`
Contributor

Suppose we have a function F: C -> C which we can decompose into functions u and v which compute the real and imaginary parts of the function:

x, y = real(z), imag(z)
return u(x, y) + v(x, y) * 1j
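
As a minimal, hypothetical illustration of this decomposition (the function f below is just an example, not part of the PR):

import torch

def f(z):
    # hypothetical C -> C function, f(z) = z**2, written via its real/imaginary parts
    x, y = z.real, z.imag
    u = x * x - y * y        # real part of z**2
    v = 2 * x * y            # imaginary part of z**2
    return u + v * 1j

z = torch.tensor(1.0 + 2.0j)
print(f(z), z * z)           # both evaluate to -3+4j
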
The JVP and VJP for function :math:`F` are defined as:
Contributor

for function F at (x, y)

def VJP(cotangent):
    c, d = real(cotangent), imag(cotangent)
    return [c, -d] * J * [1, -i]^T
Contributor

In PyTorch, the VJP is mostly what we care about, as it is the computation performed when we do backwards mode automatic differentiation. Notice that d and i are negated in the formula above.
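
A hedged sketch of where this shows up in practice (the function and values below are illustrative only, and assume complex autograd support for the ops used):

import torch

z = torch.tensor(2.0 + 1.0j, requires_grad=True)
out = z * z                                   # a C -> C function

# torch.autograd.grad evaluates the VJP at the supplied cotangent (grad_outputs).
cotangent = torch.tensor(1.0 - 0.5j)
(vjp_result,) = torch.autograd.grad(out, z, grad_outputs=cotangent)
print(vjp_result)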

Contributor

I might suggest absorbing "How are the JVP and VJP defined for :math:R^2 -> C and :math:C -> R^2 functions?" section into this one. The structure then is "Here is the general definition", and then "Here is a particular example"

Contributor Author

I just felt it would be cleaner to have them in separate sections, primarily because not everyone looking for information about autograd for complex functions would be interested in cross-domain functions.

******************************************************

1. For holomorphic functions, you get the same result as expected from using the Cauchy-Riemann equations.
2. For non-holomorphic functions, the partial derivatives of :math:`v(x, y)` are discarded.
Contributor

of the imaginary part of the function (v(x, y) above) are discarded (e.g., this is equivalent to dropping the imaginary part of the loss before performing a backward). To get the gradient with respect to the imaginary components of the function, you must explicitly specify gradient=torch.tensor(1j).

Contributor Author

yeah that's easier to read. updated!
not sure what you mean by:

To get the gradient with respect to the imaginary components of the function, you must explicitly specify gradient=torch.tensor(1j).

do you mean to say for any other desired behavior, specify the grad_out accordingly?
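
For reference, a minimal sketch of the grad_out discussion above (illustrative only; assumes complex autograd support for the op used):

import torch

z = torch.tensor(1.0 + 1.0j, requires_grad=True)
out = z * z

# Calling backward() with a cotangent of 1 keeps only the real part of the output;
# passing 1j instead propagates through the imaginary part.
out.backward(gradient=torch.tensor(1j))
print(z.grad)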

How are the JVP and VJP defined for :math:`R^2 -> C` and :math:`C -> R^2` functions?
************************************************************************************

The JVP and VJP for a function :math:`f1: C → R^2` are defined as:
Contributor

I'm still a little confused about the function of this section. Are we trying to explain why a conjugation occurs when we define the vjp for view_as_complex and view_as_real? If so, it feels like it would be more direct if we directly talked about that particular case.

Contributor Author

Yeah, the goal here was to specify the formulas used in view_as_real and view_as_complex, and also to give people a general idea of how they can define their own backward for similar functions.
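
For readers unfamiliar with the two ops mentioned here, a quick illustration (values are arbitrary):

import torch

z = torch.tensor([1.0 + 2.0j, 3.0 - 4.0j])

r = torch.view_as_real(z)          # shape (2, 2); r[..., 0] is the real part, r[..., 1] the imaginary part
z_back = torch.view_as_complex(r)  # views the same storage as a complex tensor again
print(r)
print(z_back)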

def JVP(tangent):
    c, d = real(tangent), imag(tangent)
    return [1, i]^T * J * [c, d]
Collaborator

I think you have mixed use of i and 1j throughout, you should update to use a single one all over. And maybe at the beginning add a quick "(in this document, we use blah to define the imaginary number)".

Collaborator

You're still mixing j and 1j, also I can't see where you defined it?

**************************************************

For a function F: V → W, where V and W are vector spaces, the output of
the Vector-Jacobian Product :math:`VJP : V → (W^* → V^*)` is a linear map
Collaborator

nit the VJP you define above is not this function here.
The one you define above is directly the linear mapping (because you hard coded the input in space V directly into J).

Contributor Author

oh yeah good catch updated!

Collaborator

How was this updated?

Contributor

the "output of", I imagine

the Vector-Jacobian Product :math:`VJP : V → (W^* → V^*)` is a linear map
from :math:`W^* → V^*` (explained in `Chapter 4 of Dougal’s thesis <https://dougalmaclaurin.com/phd-thesis.pdf>`_).

The negative signs in the above VJP computation are due to conjugation. The first
Collaborator

Not sure what this paragraph brings?
Why do you need the conjugation?
If the thesis contains all the details, maybe just replace this paragraph by a sentence pointing to the thesis for more details.

Contributor Author

you need conjugation because the vectors from the dual space as explained below

What happens if I call backward() on a complex scalar?
******************************************************

For general ℂ → ℂ functions, the Jacobian has 4 real-valued degrees of freedom (as in the 2x2 Jacobian matrices above),
Collaborator

I think this is missing one extra step:
The backward in pytorch does not compute a jacobian.

I think you want something that mentions that for R->R and C->R, backward() (with no argument) computes the full gradient of the function.
But for C->C functions, this is not the case...

Contributor Author

Yeah, agreed, but we are not saying the backward is computing the grad. We are just commenting on the Jacobian to explain why the gradient for holomorphic functions can still be represented as a complex number.

Collaborator

Ok, then I guess the title of the section is what confused me here.
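
As a reference for the point being made here (an illustrative summary, not text from the PR): a general ℂ → ℂ function has a 2x2 real Jacobian with 4 degrees of freedom, and the Cauchy-Riemann equations cut these down to 2 for holomorphic functions, which is why the gradient still fits in a single complex number.

.. math::
    J = \begin{bmatrix} \partial_x u & \partial_y u \\ \partial_x v & \partial_y v \end{bmatrix},
    \qquad
    \text{holomorphic: } \partial_x u = \partial_y v,\; \partial_y u = -\partial_x v
    \;\Rightarrow\; f'(z) = \partial_x u + i\,\partial_x v.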

anjali411 added a commit that referenced this pull request Jul 7, 2020
ghstack-source-id: 4386973
Pull Request resolved: #41012
@anjali411 anjali411 requested a review from albanD July 9, 2020 16:23
anjali411 added a commit that referenced this pull request Jul 9, 2020
ghstack-source-id: 8fd0118
Pull Request resolved: #41012
x, y = real(z), imag(z)
return u(x, y) + v(x, y) * 1j
where *1j* is a unit imaginary number.
Collaborator

Why bold and not in math here?

.. math::
    J = \begin{bmatrix}
        \partial_0 u(x, y) & \partial_1 u(x, y)\\
        \partial_0 v(x, y) & \partial_1 v(x, y)
        \end{bmatrix}
Collaborator

Is \partial_0 defined?
Maybe \frac{\partial u(x, y)}{\partial x} ?
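
Written out with the suggested \frac notation, the full Jacobian would read:

.. math::
    J = \begin{bmatrix}
        \frac{\partial u(x, y)}{\partial x} & \frac{\partial u(x, y)}{\partial y}\\
        \frac{\partial v(x, y)}{\partial x} & \frac{\partial v(x, y)}{\partial y}
        \end{bmatrix}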

**************************************************

For a function F: V → W, where V and W are vector spaces, the output of
the Vector-Jacobian Product :math:`VJP : V → (W^* → V^*)` is a linear map
Collaborator

How was this updated?

the Vector-Jacobian Product :math:`VJP : V → (W^* → V^*)` is a linear map
from :math:`W^* → V^*` (explained in `Chapter 4 of Dougal Maclaurin’s thesis <https://dougalmaclaurin.com/phd-thesis.pdf>`_).

The negative signs in the above `VJP` computation are due to conjugation. The first
Collaborator

I don't think it is clear what you mean by "the first vector in the output" here ?

Contributor Author

hmm okay rewrote it. let me know if that looks better

vector in the output returned by `VJP` for a given cotangent is a covector (:math:`\in ℂ^*`),
and the last vector in the output is used to get the result in :math:`ℂ`
since the final result of reverse-mode differentiation of a function is a covector belonging
to :math:`ℂ^*` (explained in `Chapter 4 of Dougal Maclaurin’s thesis <https://dougalmaclaurin.com/phd-thesis.pdf>`_).
Collaborator

I am still unsure of the value of this paragraph. It seems to justify the conjugation by saying that we need the output to be in C. But if we just want the output to be in C, we could do without the conjugation, no?
Is the justification here that the mapping we use from C* to C is defined based on the standard dot product on C, and so it maps vectors by doing a Hermitian transpose?

*******************************************************************************

The gradient for a complex function is computed assuming the input function is a holomorphic function.
This is because for general :math:`ℂ → ℂ` functions, the Jacobian has 4 real-valued degrees of freedom
Collaborator

Reading more about this, it feels like the "pure" C function definition does not hold.
And to be able to get quantities that match the gradients for holomorphic functions and provide a sensible direction for use in gradient descent algorithms, modified definitions are needed.
Unfortunately, these definitions represent the full information about the derivatives of a general complex function using twice as many elements as a regular R -> R gradient we are used to.
And so we cannot expect to get these "extended gradients" in our framework, where the .grad field is hard-coded to be the same size as the Tensor it represents the gradients of.

Contributor

What do you mean by "pure" C function definition does not hold?

Collaborator

What I call "pure C function definition" here is what happens if you apply the definition of the derivative (as a limit) from real functions to complex functions: you get derivatives only for holomorphic functions.

anjali411 added a commit that referenced this pull request Jul 9, 2020
ghstack-source-id: ea1417c
Pull Request resolved: #41012
@anjali411 anjali411 requested review from albanD and ezyang July 9, 2020 19:53
**Why is there a negative sign in the formula above?**
******************************************************

For a function F: V → W, where V and W are vector spaces, the output of
Contributor

:math:`F: V -> W`

?

Contributor Author

Based on offline sync with @albanD, I removed this section to avoid providing a fuzzy or possibly incorrect explanation.

where :math:`1j` is a unit imaginary number.

We define the JVP for function :math:`F` at :math:`(x, y)` applied to a tangent
vector :math:`c+dj \in C` as:
Contributor

it's probably better to explicitly say that this is Python pseudocode. For example, you couldn't actually use * in a real PyTorch program as that would give you pointwise multiplication, not matrix product. And the bracket syntax means something else in Python, that is also not intended here either.

Or even better, make this real code using the real PyTorch operations. Then you don't have to explain the pseudocode syntax.
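
A minimal illustration of the * point (names and values here are made up for illustration):

import torch

J = torch.tensor([[1., 2.], [3., 4.]])   # stand-in for a 2x2 Jacobian
v = torch.tensor([[1.], [0.]])           # stand-in for a column vector

pointwise = J * v   # broadcasted elementwise product, shape (2, 2)
matmul = J @ v      # matrix product, shape (2, 1) -- what the pseudocode intends
print(pointwise.shape, matmul.shape)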

Contributor Author

I updated this section to just use math notation instead. It's simpler to read and avoids the confusion

Contributor

@ezyang ezyang left a comment

I think the biggest issue in my mind is explanation of the pseudocode syntax, but I'm going to approve right now to move things along.

anjali411 added a commit that referenced this pull request Jul 10, 2020
ghstack-source-id: bfc4145
Pull Request resolved: #41012
Collaborator

@albanD albanD left a comment

I think this has the main information we want here: the formula we should use for implementing other complex ops.
We can detail it and add more justification as we go.

@facebook-github-bot
Contributor

@anjali411 merged this pull request in db38487.

@facebook-github-bot facebook-github-bot deleted the gh/anjali411/39/head branch July 13, 2020 17:56
malfet pushed a commit that referenced this pull request Jul 22, 2020
Summary: Pull Request resolved: #41012

Test Plan: Imported from OSS

Differential Revision: D22476911

Pulled By: anjali411

fbshipit-source-id: 7da20cb4312a0465272bebe053520d9911475828