
[ROCm] rocblas Aten GEMM overload for FP32 output from FP16/BF16 inputs #162600

Closed
jagadish-amd wants to merge 1 commit into pytorch:main from jagadish-amd:enable_mm_overload_rocblas

jagadish-amd (Contributor) commented Sep 10, 2025

Fix ROCm GEMM helper to set output type (C/D) based on C_Dtype template parameter.
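
As a rough illustration of the idea, the sketch below derives the C/D datatype tags from a separate `C_Dtype` template parameter instead of reusing the input type. It is not the code from this PR: `BlasDatatype`, `Half`, `BFloat16`, `BlasTypeOf`, `GemmTypes`, and `select_gemm_types` are hypothetical stand-ins, and the FP32 compute type is an assumption. The only external fact relied on is that rocBLAS's `rocblas_gemm_ex` takes separate datatype arguments for A, B, C, and D.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for this sketch; NOT the real ATen or rocBLAS types.
enum class BlasDatatype { F16, BF16, F32 };
struct Half { std::uint16_t bits; };
struct BFloat16 { std::uint16_t bits; };

// Map a C++ element type to the datatype tag a BLAS call would receive.
template <typename T> struct BlasTypeOf;
template <> struct BlasTypeOf<Half> {
  static constexpr BlasDatatype value = BlasDatatype::F16;
};
template <> struct BlasTypeOf<BFloat16> {
  static constexpr BlasDatatype value = BlasDatatype::BF16;
};
template <> struct BlasTypeOf<float> {
  static constexpr BlasDatatype value = BlasDatatype::F32;
};

// The five datatype tags passed alongside the matrices.
struct GemmTypes {
  BlasDatatype a, b, c, d, compute;
};

// Inputs (A/B) follow Dtype; outputs (C/D) follow C_Dtype.
// C_Dtype defaults to Dtype, so existing same-type callers are unaffected.
template <typename Dtype, typename C_Dtype = Dtype>
constexpr GemmTypes select_gemm_types() {
  return GemmTypes{
      BlasTypeOf<Dtype>::value,    // A
      BlasTypeOf<Dtype>::value,    // B
      BlasTypeOf<C_Dtype>::value,  // C
      BlasTypeOf<C_Dtype>::value,  // D
      BlasDatatype::F32};          // compute type (assumed FP32 here)
}

// Compile-time checks: FP16 inputs with FP32 output, and the default case.
static_assert(select_gemm_types<Half, float>().c == BlasDatatype::F32,
              "C must follow C_Dtype");
static_assert(select_gemm_types<Half>().c == BlasDatatype::F16,
              "C defaults to the input type");

// Runtime variant of the same check for the BF16 -> FP32 overload.
inline void check_fp32_output_overload() {
  constexpr GemmTypes t = select_gemm_types<BFloat16, float>();
  assert(t.a == BlasDatatype::BF16 && t.c == BlasDatatype::F32);
  (void)t;  // silence unused-variable warnings when NDEBUG is defined
}
```

If C/D were hard-coded to the input type, a caller requesting FP32 output would have its FP32 buffer interpreted as FP16/BF16; deriving the tags from `C_Dtype` avoids that mismatch in this sketch.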

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

Fix ROCm GEMM helper to set output type (C/D) based on
C_Dtype template parameter.

Signed-off-by: Jagadish Krishnamoorthy <jagadish.krishnamoorthy@amd.com>
pytorch-bot bot commented Sep 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162600

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit e25ccce with merge base 484c409:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the "module: rocm" (AMD GPU support for Pytorch) label Sep 10, 2025

jagadish-amd (Contributor Author) commented:

@pytorchbot label "topic: not user facing"

pytorch-bot bot added the "topic: not user facing" (topic category) label Sep 10, 2025
jeffdaily changed the title from "ROCm: Enable overload tests for rocblas backend." to "[ROCm] Enable overload tests for rocblas backend" Sep 10, 2025
jeffdaily changed the title from "[ROCm] Enable overload tests for rocblas backend" to "[ROCm] rocblas Aten GEMM overload for FP32 output from FP16/BF16 inputs" Sep 10, 2025
pytorch-bot bot added the "ciflow/rocm" (Trigger "default" config CI on ROCm) label Sep 10, 2025
jeffdaily added the "release notes: rocm" (mandatorylabel) and "ciflow/rocm-mi300" (Trigger "default" config CI on ROCm MI300) labels and removed the "topic: not user facing" (topic category) label Sep 10, 2025

pytorch-bot bot: This comment was marked as outdated.

pytorch-bot bot removed the "ciflow/rocm-mi300" (Trigger "default" config CI on ROCm MI300) label Sep 10, 2025
jeffdaily added the "ciflow/rocm-mi300" (Trigger "default" config CI on ROCm MI300) label Sep 10, 2025
jeffdaily marked this pull request as ready for review September 10, 2025 16:01

jagadish-amd (Contributor Author) commented:

All tests related to test_mm_bmm_dtype_overload and test_addmm_baddmm_dtype_overload are passing.
https://github.com/pytorch/pytorch/actions/runs/17619150768/job/50062792389?pr=162600#logs

cc @jeffdaily

jagadish-amd (Contributor Author) commented:

Failing job: rocm-mi300 / linux-noble-rocm-py3.12-mi300 / test (default, 6, 6, linux.rocm.gpu.gfx942.1) (gh)
Failing test: 'test/inductor/test_cudagraph_trees.py::CudaGraphTreeTests::test_alias_of_parameter'

This test passed locally: test_alias_of_parameter (__main__.CudaGraphTreeTests) OK

jeffdaily added the "ciflow/trunk" (Trigger trunk jobs on your pull request) label Sep 10, 2025

jithunnair-amd (Collaborator) commented:

@pytorchbot merge -f "Unrelated failures in rocm-mi300; trunk passed"

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…ts (pytorch#162600)

Fix ROCm GEMM helper to set output type (C/D) based on C_Dtype template parameter.

Pull Request resolved: pytorch#162600
Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
cleonard530 pushed a commit to cleonard530/pytorch that referenced this pull request Sep 22, 2025
dsashidh pushed a commit to dsashidh/pytorch that referenced this pull request Sep 26, 2025

Labels

ciflow/rocm (Trigger "default" config CI on ROCm)
ciflow/rocm-mi300 (Trigger "default" config CI on ROCm MI300)
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
module: rocm (AMD GPU support for Pytorch)
open source
release notes: rocm (mandatorylabel)

6 participants