[ROCm][CI] Add support for gfx1100 in rocm workflow + test skips#148355

Closed
amdfaa wants to merge 12 commits into pytorch:main from amdfaa:patch-13

@amdfaa (Contributor) commented Mar 3, 2025

@pytorch-bot (bot) commented Mar 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148355

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 Cancelled Job

As of commit 641fe85 with merge base 3912ba3 (image):

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot (bot) added the "module: rocm" and "topic: not user facing" labels Mar 3, 2025
@jithunnair-amd added the "keep-going" and "ciflow/rocm" labels Mar 4, 2025
@pytorch-bot (bot) removed the "ciflow/rocm" label Mar 5, 2025
@jithunnair-amd added the "ciflow/rocm" label Mar 5, 2025
@pytorch-bot (bot) removed the "ciflow/rocm" label Mar 5, 2025
@jithunnair-amd added the "ciflow/rocm" label Mar 6, 2025
@pytorch-bot (bot) removed the "ciflow/rocm" label Mar 11, 2025
@jithunnair-amd added the "ciflow/rocm" label Mar 11, 2025
@pytorch-bot (bot) removed the "ciflow/rocm" label Mar 17, 2025
@jithunnair-amd added the "ciflow/rocm" label Mar 18, 2025
@amdfaa changed the title from "[DO NOT MERGE] Test new ROCm CI Navi31 nodes" to "Add support for gfx1100 in rocm workflow" Mar 18, 2025
@amdfaa marked this pull request as ready for review March 18, 2025 14:58
@amdfaa requested review from a team and jeffdaily as code owners March 18, 2025 14:58
@pytorch-bot (bot) added the "module: inductor" label and removed the "ciflow/rocm" label Mar 18, 2025
@amdfaa changed the title from "Add support for gfx1100 in rocm workflow" to "Add support for gfx1100 in rocm workflow + test skips" Mar 18, 2025
@jithunnair-amd added the "ciflow/rocm" label Mar 19, 2025

@jeffdaily (Collaborator) left a comment

Just one question but otherwise LGTM.

@jeffdaily changed the title from "Add support for gfx1100 in rocm workflow + test skips" to "[ROCm] Add support for gfx1100 in rocm workflow + test skips" Mar 19, 2025
@pytorch-bot (bot) removed the "ciflow/rocm" label Jul 14, 2025
amdfaa added 2 commits August 27, 2025 19:12
Update rocm.yml

Update trunk.yml

Update build.sh

Update action.yml

Update trunk.yml

Update rocm.yml

Update action.yml

Update action.yml to output 0 even if it doesn't find the search

skip several tests on navi for upstream

Change gfx1100 check to use equal to 0

reduce set of tests to run on Navi31

Syntax

refactor names

getRocmVersion

fix lint and syntax errors

Lint

Remove torchinductor_opinfo

trigger rebuild

trigger rebuild

trigger rebuild

Update rocm.yml to new naming

Update job name in rocm.yml

Update job name in rocm.yml

@jithunnair-amd added the "ci-no-td" and "ciflow/rocm" labels Aug 27, 2025
@pytorch-bot (bot) removed the "ciflow/rocm" label Aug 28, 2025
@jeffdaily marked this pull request as ready for review October 7, 2025 17:07
@jeffdaily added the "ciflow/rocm" label Oct 7, 2025

@jeffdaily (Collaborator)

@pytorchbot merge -f "adds new rocm CI flow for gfx1100; post-merge only"

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
…orch#148355)

This PR adds infrastructure support for gfx1100 in the rocm workflow. Nodes have been allocated for this effort.
@dnikolaev-amd contributed all the test skips.

Pull Request resolved: pytorch#148355
Approved by: https://github.com/jeffdaily

Co-authored-by: Dmitry Nikolaev <dmitry.nikolaev@amd.com>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
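
For readers who have not seen an architecture-gated test skip before, the idea behind "skip several tests on navi" can be sketched roughly as below. This is a minimal illustration using only the standard library, not the code from this PR: the names `NAVI31_ARCH`, `is_navi31`, and `skip_if_navi31` are hypothetical, and the architecture string is passed in explicitly instead of being queried from the GPU runtime so that the sketch runs on any machine.

```python
import unittest

# Hypothetical constant for this sketch: the arch name this PR targets.
NAVI31_ARCH = "gfx1100"


def is_navi31(arch_name: str) -> bool:
    """Return True when the reported arch string names gfx1100.

    ROCm arch strings can carry feature suffixes
    (e.g. "gfx90a:sramecc+:xnack-"), so only the part before
    the first colon is compared.
    """
    return arch_name.split(":", 1)[0] == NAVI31_ARCH


def skip_if_navi31(arch_name: str, reason: str = "not supported on gfx1100"):
    """Build a unittest skip decorator for the given arch string (hypothetical helper)."""
    return unittest.skipIf(is_navi31(arch_name), reason)


class ExampleTests(unittest.TestCase):
    # Pretend the runner reported gfx1100: this test is skipped.
    @skip_if_navi31("gfx1100")
    def test_skipped_on_navi31(self):
        self.fail("should not run on gfx1100")

    # Pretend the runner reported a different arch: this test runs.
    @skip_if_navi31("gfx942:sramecc+:xnack-")
    def test_runs_elsewhere(self):
        self.assertTrue(True)


def run_example() -> unittest.TestResult:
    """Run the two example tests and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run_example()` runs both tests and records one of them as skipped. In a real test suite the architecture string would come from the device at import time, and the skip reason would normally point at a tracking issue.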

Labels

ci-no-td (Do not run TD on this PR); ciflow/rocm (Trigger "default" config CI on ROCm); keep-going (Don't stop on first failure, keep running tests until the end); Merged; module: inductor; module: rocm (AMD GPU support for Pytorch); open source; topic: not user facing (topic category)

7 participants