[ATen][CUDA] CUTLASS matmuls: add sm_103a flag #162956
Aidyn-A wants to merge 2 commits into pytorch:main
Dr. CI: ✅ No failures as of commit 642b74f with merge base 814ba34. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
@pytorchbot merge
Merge started: your change will be merged once all checks pass (ETA 0-4 hours).
Merge failed. Reason: 1 job has failed: linux-binary-manywheel / manywheel-py3_12-cuda12_8-build / build
eqy left a comment:
Looks like it needs to be disabled for CUDA 12.8 and earlier
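For illustration only, the version gate suggested above could look roughly like the sketch below. It is not the PR's actual build change: the helper name `cutlass_gencode_flags` and the default `base_archs` list are hypothetical, and the only facts taken from this thread are that the flag is `sm_103a` and that it has to stay off for CUDA 12.8 and earlier.

```python
def cutlass_gencode_flags(cuda_version, base_archs=("90a", "100a")):
    """Build nvcc -gencode flags for the CUTLASS matmul sources.

    cuda_version is a (major, minor) tuple. base_archs is a hypothetical
    default list used only for this sketch.
    """
    archs = list(base_archs)
    # Per the review comment, sm_103a must be disabled for CUDA 12.8 and
    # earlier, so it is only appended for newer toolkits.
    if cuda_version > (12, 8):
        archs.append("103a")
    return [f"-gencode arch=compute_{a},code=sm_{a}" for a in archs]


if __name__ == "__main__":
    for version in [(12, 8), (12, 9)]:
        print(version, cutlass_gencode_flags(version))
```

Comparing `(major, minor)` tuples keeps the check correct across major versions, so a toolkit such as `(13, 0)` also gets the flag.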
@pytorchbot merge
Merge started: your change will be merged once all checks pass (ETA 0-4 hours).
This PR adds an `sm_103a` flag for GroupMM and RowwiseScaledMM. Unlike pytorch#161399, it only adds the flag, because support for `sm_103a` matmuls is going to be added to CUTLASS v4.2 (see pytorch#161399 (comment)).
Pull Request resolved: pytorch#162956
Approved by: https://github.com/eqy, https://github.com/Skylion007
This PR adds an `sm_103a` flag for GroupMM and RowwiseScaledMM. Unlike #161399, it only adds the flag, because support for `sm_103a` matmuls is going to be added to CUTLASS v4.2 (see #161399 (comment)).
cc @ptrblck @msaroufim @eqy @jerryzh168 @manuelcandales @SherlockNoMad @angelayi