[ROCm][CI] Create periodic-rocm-mi200.yml #166544
amdfaa wants to merge 8 commits into pytorch:main
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166544
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV
There is 1 currently active SEV. If your PR is affected, please view it below:
⏳ No Failures, 13 Pending
As of commit 98717aa with merge base deb7763.
This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorchbot merge

Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team

Warning: Unknown label
Please add the new label to .github/pytorch-probot.yml

Merge failed
Reason: New commits were pushed while merging. Please rerun the merge command.
Details for Dev Infra team: Raised by workflow job

@pytorchbot merge -f "Force merging to relieve MI2xx queueing and provide separate workflow to target ROCm MI2xx distributed jobs"

Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team

* We are separating out the ROCm jobs of the periodic workflow.
* We are introducing a new label `ciflow/periodic-rocm-mi200` to allow us to run distributed tests only on ROCm runners, without triggering many other jobs on the `periodic.yml` workflow (via `ciflow/periodic`).
* This new workflow will also be triggered via the `ciflow/periodic` label, thus maintaining the old status quo.
* We are reverting to the `linux.rocm.gpu.4` label since it currently targets many more CI nodes than the K8s/ARC-based `linux.rocm.gpu.mi250.4` label, which is still having some network/scaling issues.

Pull Request resolved: #166544
Approved by: https://github.com/jeffdaily
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd
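
For readers unfamiliar with how ciflow labels map to workflows, here is a minimal sketch of what the trigger and runner selection described in this PR could look like. It is not the contents of the merged `periodic-rocm-mi200.yml`. It assumes ciflow labels are delivered as pushed tags of the form `<label>/<PR number>`. The cron schedule, the job name, and the single placeholder step are hypothetical. Only the two label names and the `linux.rocm.gpu.4` runner label come from the description above.

```yaml
# Illustrative sketch only; not the actual workflow file from this PR.
name: periodic-rocm-mi200

on:
  schedule:
    # Hypothetical schedule; the real cadence is not stated in this PR.
    - cron: "45 0,8,16 * * *"
  push:
    tags:
      # Existing label: keeps the old behavior, so ciflow/periodic
      # still triggers the ROCm MI200 jobs.
      - ciflow/periodic/*
      # New label: triggers only this workflow, not all of periodic.yml.
      - ciflow/periodic-rocm-mi200/*
  workflow_dispatch:

jobs:
  # Hypothetical job name; the real workflow's build/test jobs are not shown here.
  rocm-distributed-test:
    # Runner label the PR reverts to, since it targets more CI nodes
    # than linux.rocm.gpu.mi250.4 at this point.
    runs-on: linux.rocm.gpu.4
    steps:
      # Placeholder step standing in for the distributed test invocation.
      - run: echo "run ROCm distributed tests here"
```

Listing both tag patterns is what lets the new label run the ROCm distributed tests in isolation while `ciflow/periodic` continues to cover them as before.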