Skip to content

Conversation

@anjali411
Copy link
Contributor

@anjali411 anjali411 commented Sep 21, 2020

Stack from ghstack:

TODO: Add R -> C tests in #44744 (blocked on some JIT changes)

Differential Revision: D23975361

anjali411 added a commit that referenced this pull request Sep 21, 2020
… torch.vdot

ghstack-source-id: f4e80f8
Pull Request resolved: #45074
@dr-ci
Copy link

dr-ci bot commented Sep 21, 2020

💊 CI failures summary and remediations

As of commit bc6a696 (more details on the Dr. CI page):



🕵️ 6 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single (1/6)

Step: "Set Up CI Environment After attach_workspace" (full log | diagnosis details | 🔁 rerun)

E: Unable to locate package expect-dev
E: Unable to locate package expect-dev 
+ sudo apt-get -y install moreutils expect-dev 
Reading package lists... 100%  Reading package lists... Done  
Building dependency tree... 50%  Building dependency tree         
Reading state information... 0%  Reading state information... Done  
Package moreutils is not available, but is referred to by another package. 
This may mean that the package is missing, has been obsoleted, or 
is only available from another source 
 
E: Package 'moreutils' has no installation candidate 
E: Unable to locate package expect-dev 

See CircleCI build pytorch_linux_xenial_cuda9_2_cudnn7_py3_gcc5_4_build (2/6)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Sep 29 13:54:40 caused by: Connection refused (os error 111)
Sep 29 13:54:40 ++++ extract_trap_cmd 
Sep 29 13:54:40 ++++ printf '%s\n' '' 
Sep 29 13:54:40 +++ printf '%s\n' cleanup 
Sep 29 13:54:40 ++ trap -- ' 
Sep 29 13:54:40 cleanup' EXIT 
Sep 29 13:54:40 ++ [[ pytorch-linux-xenial-cuda9.2-cudnn7-py3-gcc5.4-build != *pytorch-win-* ]] 
Sep 29 13:54:40 ++ which sccache 
Sep 29 13:54:40 ++ sccache --stop-server 
Sep 29 13:54:40 Stopping sccache server... 
Sep 29 13:54:40 error: couldn't connect to server 
Sep 29 13:54:40 caused by: Connection refused (os error 111) 
Sep 29 13:54:40 ++ true 
Sep 29 13:54:40 ++ rm /var/lib/jenkins/sccache_error.log 
Sep 29 13:54:40 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory 
Sep 29 13:54:40 ++ true 
Sep 29 13:54:40 ++ [[ pytorch-linux-xenial-cuda9.2-cudnn7-py3-gcc5.4-build == *rocm* ]] 
Sep 29 13:54:40 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Sep 29 13:54:40 ++ SCCACHE_IDLE_TIMEOUT=1200 
Sep 29 13:54:40 ++ RUST_LOG=sccache::server=error 
Sep 29 13:54:40 ++ sccache --start-server 
Sep 29 13:54:40 Starting sccache server... 

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (3/6)

Step: "Set Up CI Environment After attach_workspace" (full log | diagnosis details | 🔁 rerun)

E: Unable to locate package expect-dev
E: Unable to locate package expect-dev 
+ sudo apt-get -y install moreutils expect-dev 
Reading package lists... 100%  Reading package lists... Done  
Building dependency tree... 50%  Building dependency tree         
Reading state information... 0%  Reading state information... Done  
Package moreutils is not available, but is referred to by another package. 
This may mean that the package is missing, has been obsoleted, or 
is only available from another source 
 
E: Package 'moreutils' has no installation candidate 
E: Unable to locate package expect-dev 

See CircleCI build binary_linux_libtorch_3_7m_cpu_gcc5_4_cxx11-abi_shared-with-deps_test (4/6)

Step: "Set Up CI Environment After attach_workspace" (full log | diagnosis details | 🔁 rerun)

E: Unable to locate package expect-dev
E: Unable to locate package expect-dev 
+ sudo apt-get -y install moreutils expect-dev 
Reading package lists... 100%  Reading package lists... Done  
Building dependency tree... 50%  Building dependency tree         
Reading state information... 0%  Reading state information... Done  
Package moreutils is not available, but is referred to by another package. 
This may mean that the package is missing, has been obsoleted, or 
is only available from another source 
 
E: Package 'moreutils' has no installation candidate 
E: Unable to locate package expect-dev 

See CircleCI build pytorch_bazel_test (5/6)

Step: "Set Up CI Environment After attach_workspace" (full log | diagnosis details | 🔁 rerun)

E: Unable to locate package expect-dev
E: Unable to locate package expect-dev 
+ sudo apt-get -y install moreutils expect-dev 
Reading package lists... 100%  Reading package lists... Done  
Building dependency tree... 50%  Building dependency tree         
Reading state information... 0%  Reading state information... Done  
Package moreutils is not available, but is referred to by another package. 
This may mean that the package is missing, has been obsoleted, or 
is only available from another source 
 
E: Package 'moreutils' has no installation candidate 
E: Unable to locate package expect-dev 

See CircleCI build pytorch_linux_backward_compatibility_check_test (6/6)

Step: "Set Up CI Environment After attach_workspace" (full log | diagnosis details | 🔁 rerun)

E: Unable to locate package expect-dev
E: Unable to locate package expect-dev 
+ sudo apt-get -y install moreutils expect-dev 
Reading package lists... 100%  Reading package lists... Done  
Building dependency tree... 50%  Building dependency tree         
Reading state information... 0%  Reading state information... Done  
Package moreutils is not available, but is referred to by another package. 
This may mean that the package is missing, has been obsoleted, or 
is only available from another source 
 
E: Package 'moreutils' has no installation candidate 
E: Unable to locate package expect-dev 

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch (expand for instructions)

If your commit is newer than viable/strict, you can try basing on an older, stable commit:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase --onto FETCH_HEAD $(git merge-base origin/master HEAD)

If your commit is older than viable/strict:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD

Check out the recency history of this "viable master" tracking branch.


ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 23 times.

@anjali411 anjali411 requested review from ezyang and zou3519 September 21, 2020 17:58
('addr', (S, M), ((S,), (M,)), 'coef', (), (), (), ident, {'beta': 0.2, 'alpha': 0.6}),
('addr', (), ((S,), (M,)), 'broadcast_lhs_coef', (), (), (), ident, {'beta': 0.2, 'alpha': 0.6}),
('dot', (L,), ((L,),), '', (True,)),
('vdot', (L,), ((L,),), '', (True,)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be (False,) at the end (to turn off JIT autodiff testing)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, ('vdot', (L,), ((L,),), does the trick

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc. @eellison torch.vdot is not supported by JIT autodiff right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question here is: Is it OK that autodiff doesn't support vdot? It's a new operator we added recently. Also, does the JIT still use the old autodiff pass?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be fine if it's not supported by JIT autodiff - having an autodiff is an optimization which allows fusion to occur in training.

Yea it still uses the old autodiff pass. The way to define a backwards is in the symbolic_script.cpp file. Currently, there is really only autodiff coverage for pointwise ops because those are the ops that we codegen fusion for.

…inition for torch.vdot"


TODO: Add R -> C tests in #44744 (blocked on some JIT changes)


[ghstack-poisoned]
anjali411 added a commit that referenced this pull request Sep 21, 2020
… torch.vdot

ghstack-source-id: 3462efa
Pull Request resolved: #45074
@codecov
Copy link

codecov bot commented Sep 21, 2020

Codecov Report

❗ No coverage uploaded for pull request base (gh/anjali411/57/base@5d1fee2). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@                   Coverage Diff                   @@
##             gh/anjali411/57/base   #45074   +/-   ##
=======================================================
  Coverage                        ?   67.85%           
=======================================================
  Files                           ?      384           
  Lines                           ?    50020           
  Branches                        ?        0           
=======================================================
  Hits                            ?    33940           
  Misses                          ?    16080           
  Partials                        ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d1fee2...589c3e9. Read the comment docs.

Comment on lines 90 to 96
Tensor correct_dtype_gradients(ScalarType self_st, Tensor gradient_result) {
if (!at::isComplexType(self_st) && gradient_result.is_complex()) {
// R -> C
return at::real(gradient_result);
}
return gradient_result;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"correct_dtype_gradients" seems like a very generic name (e.g. it could be mistaken for a function that handles float -> double dtype conversion). Also, it looks like this function will be used quite a lot in autograd formulas.

Tossing some quick ideas out there:

  • "handle_r_to_c", "handle_real_to_complex"
  • "maybe_real_part"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that at::real should noop on real tensors, similar to how at::conj is noop on real tensors too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zou3519 makes sense.handle_real_to_complex should be clearer, I think.

@ezyang yeah I agree. The reason we disabled at::real for non-complex tensors before was because it would be weird to have real return a view for non-complex tensors and at::imag return a new tensor populated with zeros (which it was before). We can certainly only enable at::real for non-complex tensors if we want though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but now that at::conj is a no-op for real tensors, I think we should probably be OK with making at::real do the same as well. I don't care... too much about imag, I don't think it shows up in situations like this.

@ezyang
Copy link
Contributor

ezyang commented Sep 22, 2020

Besides Richard's comment, rest of the PR looks reasonable.

…inition for torch.vdot"


TODO: Add R -> C tests in #44744 (blocked on some JIT changes)


[ghstack-poisoned]
anjali411 added a commit that referenced this pull request Sep 28, 2020
… torch.vdot

ghstack-source-id: 9e4c16a
Pull Request resolved: #45074
@anjali411 anjali411 added complex_autograd module: complex Related to complex number support in PyTorch labels Sep 28, 2020
Copy link
Contributor

@zou3519 zou3519 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the name change. The discussion on at::real being a no-op for real tensors seems orthogonal so I am approving to unblock.

…inition for torch.vdot"


TODO: Add R -> C tests in #44744 (blocked on some JIT changes)

Differential Revision: [D23975361](https://our.internmc.facebook.com/intern/diff/D23975361)

[ghstack-poisoned]
anjali411 added a commit that referenced this pull request Sep 29, 2020
… torch.vdot

ghstack-source-id: 7a82044
Pull Request resolved: #45074
@facebook-github-bot
Copy link
Contributor

@anjali411 merged this pull request in 18876b5.

@facebook-github-bot facebook-github-bot deleted the gh/anjali411/57/head branch October 3, 2020 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

complex_autograd module: complex Related to complex number support in PyTorch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants