Improve zero sized input for addmv #41824
Conversation
💊 CI failures summary and remediations — as of commit 9dd5bad (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 (This comment was automatically generated by Dr. CI and has been revised 14 times.)
I don't think this is the right fix, because the root cause of that failing test case was fp16 gemm, not fp16 gemv. This fix will mask the gemm failure, because gemv for the degenerate case won't go through gemm, but the gemm failure will still remain.
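For context, the degenerate case under discussion can be sketched as follows. This is a hypothetical minimal repro, not the PR's actual test: it assumes the post-fix semantics in which a reduction over a zero-length dimension contributes nothing, so the result collapses to `beta * input`.

```python
import torch

# Degenerate addmv: mat has shape (3, 0) and vec has shape (0,), so
# mat @ vec is an all-zeros vector of length 3. The result should be
# beta * input + alpha * 0 == beta * input, without the empty
# dimension ever reaching a gemv/gemm kernel.
inp = torch.tensor([1.0, 2.0, 3.0])
mat = torch.empty(3, 0)
vec = torch.empty(0)

out = torch.addmv(inp, mat, vec, beta=2.0, alpha=5.0)
print(out)  # expected: 2 * inp == tensor([2., 4., 6.])
```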
(Just realized that I put the wrong issue number; it is fixed now.) @ngimel I think both addmm and addmv are wrong? And in the issue (#41340), the author reports for addmv:
Traceback (most recent call last):
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 777, in wrapper
method(*args, **kwargs)
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 777, in wrapper
method(*args, **kwargs)
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 241, in instantiated_test
result = test(self, device_arg, dtype)
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 411, in dep_fn
return fn(slf, device, *args, **kwargs)
File "test_torch.py", line 13909, in test_blas_alpha_beta_empty
torch.addmv(input=input, mat=mat, vec=vec, alpha=alpha, beta=beta))
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 1080, in assertEqual
exact_device=exact_device)
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/_internal/common_utils.py", line 971, in _compareTensors
return _compare_tensors_internal(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan)
File "/tmp/easybuild-tmp/eb-1Ebm0K/tmpcR9xV8/lib/python3.7/site-packages/torch/testing/__init__.py", line 122, in _compare_tensors_internal
if torch.allclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan):
RuntimeError: CUDA error: an illegal memory access was encountered |
Yeah, the failing test is addmv. Addmv/mv call
@ngimel now addmm is fixed too
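The analogous zero-sized addmm case can be sketched like this. Again a hedged illustration with assumed shapes, not the PR's own test: the definition `beta * input + alpha * (mat1 @ mat2)` should hold even when the inner dimension is zero.

```python
import torch

# Degenerate addmm: mat1 is (2, 0) and mat2 is (0, 3), so mat1 @ mat2
# is an all-zeros (2, 3) matrix. The result should therefore reduce to
# beta * input, consistent with beta*input + alpha*(mat1 @ mat2).
inp = torch.ones(2, 3)
mat1 = torch.empty(2, 0)
mat2 = torch.empty(0, 3)

out = torch.addmm(inp, mat1, mat2, beta=4.0, alpha=2.0)
print(out)  # expected: 4 * inp, i.e. a 2x3 matrix of 4s
```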
ping @ngimel |
facebook-github-bot
left a comment
@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thank you for the fix!
This reverts commit aef2890.
This reverts commit 25c6141.
fixes #41340
Unfortunately, I still cannot get a K80 to verify the fix, but it should be working.