@ngimel ngimel commented Sep 15, 2020

Per title. If beta=0 and the slow path was taken, nan and inf in the result were not masked, unlike in other linear algebra functions. Similarly, because mv is implemented as addmv with beta=0, the mv slow path sometimes produced wrong results.
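
To make the masking requirement concrete, here is a minimal pure-Python sketch of the intended semantics. It is not PyTorch's implementation: `addmv_reference` and `addmv_naive` are hypothetical helpers, and the naive version only illustrates how unmasked values can propagate, not how the slow path was actually written.

```python
import math

def addmv_reference(t, m, v, beta=1.0, alpha=1.0):
    """Sketch of out = beta * t + alpha * (m @ v) on plain lists.

    Assumption: when beta == 0, t is ignored entirely, so nan/inf in t
    must not reach the output (0 * nan would otherwise be nan).
    """
    out = []
    for i, row in enumerate(m):
        acc = sum(a * b for a, b in zip(row, v))
        if beta == 0:
            out.append(alpha * acc)  # t[i] is never read
        else:
            out.append(beta * t[i] + alpha * acc)
    return out

def addmv_naive(t, m, v, beta=1.0, alpha=1.0):
    """Same formula without the beta == 0 special case (hypothetical)."""
    return [beta * t[i] + alpha * sum(a * b for a, b in zip(row, v))
            for i, row in enumerate(m)]

t = [math.nan, math.inf]
m = [[1.0, 2.0], [3.0, 4.0]]
v = [1.0, 1.0]

masked = addmv_reference(t, m, v, beta=0)  # nan/inf in t are ignored
naive = addmv_naive(t, m, v, beta=0)       # 0 * nan and 0 * inf are both nan
```

Since mv is described above as addmv with beta=0, the same masking is what keeps stale values out of the mv result.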

v = torch.randn(100, device=device).to(dtype)
self._test_addmm_addmv(torch.addmv, t, m, v, beta=0)

# Test beta=0, v=nan, 0-strided v

Suggested change
# Test beta=0, v=nan, 0-strided v
# Test beta=0, t=nan, 0-strided v

ngimel commented Sep 15, 2020

Tests expanded according to @zasdfgbnm's suggestion (thanks!). Initialization of broadcasted matrices/vectors in the tests is fixed to stay within exactly representable numbers, which reduces errors.
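
As a rough illustration of why exactly representable inputs reduce errors, the sketch below rounds values through IEEE 754 half precision using only the standard library. The value ranges and the `to_half` helper are assumptions chosen for illustration; the values the tests actually use are not shown in this conversation.

```python
import struct

def to_half(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("e", struct.pack("e", x))[0]

# Small integers survive the round trip unchanged; 0.1 does not.
small_ints = [float(i) for i in range(-8, 9)]
ints_exact = all(to_half(x) == x for x in small_ints)
tenth_exact = to_half(0.1) == 0.1

# A length-100 dot product of small integers, rounded to half precision
# after every multiply and add, still matches the exact result because
# every intermediate value is a small integer.
row = [float((i % 5) - 2) for i in range(100)]
vec = [float((i % 3) - 1) for i in range(100)]
acc = 0.0
for a, b in zip(row, vec):
    acc = to_half(acc + to_half(a * b))
exact = sum(a * b for a, b in zip(row, vec))
```

With inputs like these, a comparison against a reference result does not have to absorb rounding differences between code paths.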

codecov bot commented Sep 15, 2020

Codecov Report

Merging #44681 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #44681   +/-   ##
=======================================
  Coverage   67.97%   67.97%           
=======================================
  Files         384      384           
  Lines       49626    49626           
=======================================
  Hits        33731    33731           
  Misses      15895    15895           

Powered by Codecov. Last update ace81b6...324f2a9. Read the comment docs.

@mruberry mruberry left a comment

Maybe we can have nice things after all.

@facebook-github-bot facebook-github-bot left a comment

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

dr-ci bot commented Sep 15, 2020

💊 CI failures summary and remediations

As of commit 324f2a9 (more details on the Dr. CI page):



🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_test (1/1)

Step: "Run tests"

Sep 15 19:02:03 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future
Sep 15 19:02:03 At: 
Sep 15 19:02:03   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Sep 15 19:02:03   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Sep 15 19:02:03  
Sep 15 19:02:03 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Sep 15 19:02:03  
Sep 15 19:02:03 At: 
Sep 15 19:02:03   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Sep 15 19:02:03   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Sep 15 19:02:03  
Sep 15 19:02:03 [E request_callback_no_python.cpp:618] Received error while processing request type 2: RuntimeError: Can not pickle torch.futures.Future 
Sep 15 19:02:03  
Sep 15 19:02:03 At: 
Sep 15 19:02:03   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(93): serialize 
Sep 15 19:02:03   /opt/conda/lib/python3.6/site-packages/torch/distributed/rpc/internal.py(145): serialize 
Sep 15 19:02:03  
Sep 15 19:02:03 [W tensorpipe_agent.cpp:576] RPC agent for worker0 encountered error when reading incoming request from worker3: EOF: end of file (this is expected to happen during shutdown) 
Sep 15 19:02:04 ok (1.636s) 
Sep 15 19:02:05   test_return_future_remote (__main__.TensorPipeRpcTestWithSpawn) ... ok (1.637s) 
Sep 15 19:02:07   test_return_local_rrefs (__main__.TensorPipeRpcTestWithSpawn) ... [W tensorpipe_agent.cpp:576] RPC agent for worker1 encountered error when reading incoming request from worker2: EOF: end of file (this is expected to happen during shutdown) 
Sep 15 19:02:07 [W tensorpipe_agent.cpp:576] RPC agent for worker0 encountered error when reading incoming request from worker2: EOF: end of file (this is expected to happen during shutdown) 

❄️ 1 failure tentatively classified as flaky

but reruns have not yet been triggered to confirm:

See CircleCI build binary_windows_libtorch_3_7_cpu_release_build (1/1)

Step: "Checkout code"

Writing SSH key for checkout to id_rsa
Creating .ssh directory
Adding the following entries to known_hosts:
github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ==
bitbucket.org ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAubiN81eDcafrgMeLzaFPsw2kNvEcqTKl/VqLat/MaB33pZy0y3rJZtnqwR2qOOvbwKZYKiEO1O6VqNEBxKvJJelCq0dTXWT5pbO2gDXC6h6QDXCaHo6pOHGPUy+YBaGQRGuSusMEASYiWunYN0vCAI8QaXnWMXNMdFP3jHAJH0eDsoiGnLPBlBp4TNm6rYI74nMzgz3B9IikW4WVK+dc8KZJZWYjAuORU3jc1c/NPskD2ASinf8v3xnfXeukU0sJ5N6m5E8VLjObPEO+mN2t/FZTMZLiFqPWc/ALSqnMnnhwrNi2rbfg/rd/IpL8Le3pSBne8+seeFVBoGqzHM9yXw==

Writing SSH key for checkout to id_rsa

This comment has been revised 2 times.

@facebook-github-bot facebook-github-bot left a comment

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ngimel merged this pull request in e6101f5.

xuzhao9 pushed a commit that referenced this pull request Sep 18, 2020
…v slow path (#44681)

Summary:
per title. If `beta=0` and slow path was taken, `nan` and `inf` in the result were not masked as is the case with other linear algebra functions. Similarly, since `mv` is implemented as `addmv` with `beta=0`, wrong results were sometimes produced for `mv` slow path.

Pull Request resolved: #44681

Reviewed By: mruberry

Differential Revision: D23708653

Pulled By: ngimel

fbshipit-source-id: e2d5d3e6f69b194eb29b327e1c6f70035f3b231c
@ngimel ngimel deleted the lda branch September 30, 2020 04:32