[CI] Use 1-GPU runners for rocm-mi355.yml #165658

Closed
jithunnair-amd wants to merge 2 commits into main from jithunnair-amd-patch-3

Conversation

@jithunnair-amd
Collaborator

@jithunnair-amd jithunnair-amd commented Oct 16, 2025

rocm-mi355.yml should only need 1-GPU runners, since it runs the `default` test config, which needs only 1 GPU.

cc @jeffdaily @sunway513 @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

@pytorch-bot

pytorch-bot bot commented Oct 16, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165658

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 23 Pending

As of commit 2d3a0fd with merge base e1d71a6:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ciflow/rocm, module: rocm, and topic: not user facing labels Oct 16, 2025
@jithunnair-amd
Collaborator Author

@saienduri Can you please confirm that we have the linux.rocm.gpu.mi355.1 runner scaleset and label available?

@jithunnair-amd jithunnair-amd changed the title from "Use 1-GPU runners for rocm-mi355.yml" to "[CI] Use 1-GPU runners for rocm-mi355.yml" Oct 16, 2025
@jithunnair-amd jithunnair-amd added the ciflow/rocm-mi355 label Oct 16, 2025
@jithunnair-amd
Collaborator Author

Since this PR is just changing to 1-GPU runners, like we have already done for MI300 workflows, we just need to ensure that the linux.rocm.gpu.mi355.1 label works: https://github.com/pytorch/pytorch/actions/runs/18574262015/job/52958336975?pr=165658
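
For illustration, a change of this kind usually amounts to pointing the workflow's test matrix at a different runner label. The fragment below is a minimal sketch in generic GitHub Actions syntax, not the actual contents of rocm-mi355.yml: only the `linux.rocm.gpu.mi355.1` label and the `default` config name come from this thread, while the job name, shard count, and surrounding keys are hypothetical.

```yaml
# Hypothetical fragment, not copied from rocm-mi355.yml.
# From this thread: the runner label linux.rocm.gpu.mi355.1 and the "default" config.
# Assumed for illustration: job name, shard numbers, and the echo step.
jobs:
  example-test:
    strategy:
      matrix:
        include:
          - config: default
            shard: 1
            num_shards: 2                    # assumed value
            runner: linux.rocm.gpu.mi355.1   # 1-GPU MI355 runner label
    # The job is scheduled onto whichever runner carries the matrix label.
    runs-on: ${{ matrix.runner }}
    steps:
      - run: echo "config=${{ matrix.config }} shard=${{ matrix.shard }}/${{ matrix.num_shards }}"
```

Because scheduling depends only on a runner with that label being registered, a job that gets picked up at all is enough to show the label resolves, which is the check the linked run is used for.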

@jithunnair-amd
Collaborator Author

@pytorchbot merge -f "Jobs scheduling successfully on single-GPU MI355 runners. No need to run to completion"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
Should only need 1-GPU runners for rocm-mi355.yml since it runs `default` test config which only needs 1 GPU

Pull Request resolved: pytorch#165658
Approved by: https://github.com/jeffdaily
zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 22, 2025
Should only need 1-GPU runners for rocm-mi355.yml since it runs `default` test config which only needs 1 GPU

Pull Request resolved: pytorch#165658
Approved by: https://github.com/jeffdaily
@github-actions github-actions bot deleted the jithunnair-amd-patch-3 branch November 16, 2025 02:22

Labels

ciflow/rocm, ciflow/rocm-mi355, Merged, module: rocm, open source, topic: not user facing


4 participants