[PyTorch] Add fused addmm path in linear for contiguous 3D input #72728
Conversation
If the input is 3D and contiguous, we can get a fused addmm by reshaping. Differential Revision: [D34176407](https://our.internmc.facebook.com/intern/diff/D34176407/) [ghstack-poisoned]
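The reshaping trick described in the PR summary can be sketched as follows. This is a hedged illustration of the idea, not the actual ATen implementation; the function name `linear_3d_fused` is hypothetical.

```python
import torch

def linear_3d_fused(input, weight, bias):
    # Hypothetical sketch of the optimization in this PR: when a 3D input
    # is contiguous, flatten the leading batch dimensions so the whole
    # linear call becomes a single fused addmm instead of a matmul plus add.
    assert input.dim() == 3 and input.is_contiguous()
    batch, seq, in_features = input.shape
    out_features = weight.shape[0]
    # View as 2D: (batch * seq, in_features). A view is free for
    # contiguous tensors, which is why contiguity is required.
    input_2d = input.view(batch * seq, in_features)
    # addmm computes bias + input_2d @ weight.t() in one fused call.
    out_2d = torch.addmm(bias, input_2d, weight.t())
    return out_2d.view(batch, seq, out_features)

x = torch.randn(4, 7, 16)
w = torch.randn(32, 16)
b = torch.randn(32)
ref = torch.nn.functional.linear(x, w, b)
assert torch.allclose(linear_3d_fused(x, w, b), ref, atol=1e-5)
```

The result matches `torch.nn.functional.linear` on the original 3D input; the win is that the fused `addmm` kernel handles the bias addition in the same launch as the matrix multiply.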
💊 CI failures summary and remediations: as of commit 693aad7 (more details on the Dr. CI page), there are no failures yet. This comment was automatically generated by Dr. CI.
… input" If the input is 3D and contiguous, we can get a fused addmm by reshaping. Differential Revision: [D34176407](https://our.internmc.facebook.com/intern/diff/D34176407/) [ghstack-poisoned]
Does this fix the related #39661? (except …)
Pull Request resolved: #72728 If the input is 3D and contiguous, we can get a fused addmm by reshaping. ghstack-source-id: 148949323 Differential Revision: [D34176407](https://our.internmc.facebook.com/intern/diff/D34176407/)
Summary: Pull Request resolved: #72728. If the input is 3D and contiguous, we can get a fused addmm by reshaping.
ghstack-source-id: 152278479
Test Plan: existing tests?
Reviewed By: zrphercule
Differential Revision: D34176407
fbshipit-source-id: 899f216cadcd782c3b1b046025228df04228c740
Hey @swolchok.
…puts (#92201) Fix for the issue surfaced on the discuss forum: https://discuss.pytorch.org/t/cuda-error-cublas-status-not-supported-when-calling-cublasltmatmul-from-torch-nn-functional-linear/170214

Note that PyTorch builds before #71200 should not be affected, as there was no `cublasLt` dispatch path. Additionally, the provided repro has the quirk of using a 3D input, which means it will not dispatch to `cublasLt`-backed `addmm` until builds that include #72728. Changing the input to 2D by trivially removing the size-`1` dimension will surface the failure on builds after #71200.

Interestingly, the use case where _all_ inputs are 2-byte aligned is supported (runs without crashing), but the case where some pointers are more than 2-byte aligned and some are exactly 2-byte aligned is not. This behavior suggests that the `cuBlasLt` heuristics are incorrect, as the heuristic function has visibility of the raw pointer values via the descriptors when it is called. We will follow up with `cuBlasLt`, but this fix is needed to prevent unnecessary crashes for now. CC @ptrblck @ngimel

Pull Request resolved: #92201 Approved by: https://github.com/ngimel
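The alignment problem above is easy to reproduce at the pointer level: a slice of an fp16 tensor can land on an address that is only 2-byte aligned. The helper below is a hypothetical sketch of the kind of check the fix adds before taking the `cublasLt` path; it uses CPU tensors purely for illustration, since the actual crash requires CUDA.

```python
import torch

def is_16_byte_aligned(t):
    # Hypothetical helper mirroring the sort of alignment check guarding
    # the cublasLt-backed addmm dispatch in the fix for #92201.
    return t.data_ptr() % 16 == 0

# Fresh allocations from the caching allocator are well aligned.
buf = torch.empty(64, dtype=torch.float16)
assert is_16_byte_aligned(buf)

# Slicing off one fp16 element (2 bytes) yields a pointer that is only
# 2-byte aligned: the mixed-alignment case described above.
misaligned = buf[1:]
assert not is_16_byte_aligned(misaligned)
```

Passing such a misaligned view together with well-aligned weight and bias tensors reproduces the "some > 2-byte, some == 2-byte" mix that triggered `CUBLAS_STATUS_NOT_SUPPORTED`.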
Stack from ghstack (oldest at bottom):
If the input is 3D and contiguous, we can get a fused addmm by reshaping.
Differential Revision: D34176407