
Conversation

@jithunnair-amd (Collaborator) commented Dec 11, 2024

Remove gfx900 and gfx906 archs as they're long in the tooth. This should help reduce the increasing size of ROCm binaries.

cc @jeffdaily @sunway513 @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd

pytorch-bot bot commented Dec 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142827

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 New Failure, 3 Unrelated Failures

As of commit f6b347e with merge base 2b105de:

NEW FAILURE - The following job has failed:

UNSTABLE - The following jobs failed, but this was likely due to flakiness present on trunk; they have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/rocm Trigger "default" config CI on ROCm module: rocm AMD GPU support for Pytorch topic: not user facing topic category labels Dec 11, 2024
@jithunnair-amd jithunnair-amd marked this pull request as ready for review December 12, 2024 05:30
@jithunnair-amd (Collaborator, Author):

Libtorch and manywheel docker images built successfully with PYTORCH_ROCM_ARCH not containing gfx900 or gfx906.
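As a hypothetical sketch of what that looks like in practice (the arch list and build command below are illustrative assumptions, not taken from this PR), a build with the pruned arch list might be:

```shell
# Illustrative only: an arch list without gfx900/gfx906.
# The actual list used by the PyTorch CI docker images may differ.
export PYTORCH_ROCM_ARCH="gfx908;gfx90a;gfx1030;gfx1100"

# Build PyTorch from source with the reduced arch list;
# fewer target archs means fewer device code objects in the fat binary.
python setup.py develop
```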

@jithunnair-amd (Collaborator, Author):

@pytorchbot merge -f "Unrelated CI failures". ROCm manywheel/libtorch docker images built successfully"

pytorch-bot bot commented Dec 12, 2024

❌ 🤖 pytorchbot command failed:

Got EOF while in a quoted string
Try `@pytorchbot --help` for more info.

@jithunnair-amd (Collaborator, Author):

@pytorchbot merge -f "Unrelated CI failures. ROCm manywheel/libtorch docker images built successfully"

@jithunnair-amd jithunnair-amd changed the title [ROCm] Prune old gfx archs from binaries [ROCm] Prune old gfx archs gfx900/gfx906 from binaries Dec 12, 2024
@pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f only as a last resort; instead consider -i/--ignore-current to continue the merge while ignoring current failures. This allows currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: Check the merge workflow status here.

@EwoutH commented Dec 13, 2024

I can’t say anything other than that I’m disappointed AMD doesn’t want to compete with Nvidia on software support.

@jeffdaily (Collaborator):

@pytorchbot revert

pytorch-bot bot commented Dec 13, 2024

❌ 🤖 pytorchbot command failed:

@pytorchbot revert: error: the following arguments are required: -m/--message, -c/--classification

usage: @pytorchbot revert -m MESSAGE -c
                          {nosignal,ignoredsignal,landrace,weird,ghfirst}

Try @pytorchbot --help for more info.

@jeffdaily (Collaborator):

@pytorchbot revert -m "prematurely dropped support for gfx900/gfx906" -c weird

@pytorchmergebot (Collaborator):

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot (Collaborator):

@jithunnair-amd your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Dec 13, 2024
…)"

This reverts commit 1e2b841.

Reverted #142827 on behalf of https://github.com/jeffdaily due to prematurely dropped support for gfx900/gfx906 ([comment](#142827 (comment)))
@IMbackK (Contributor) commented Dec 18, 2024

@jeffdaily @jithunnair-amd maybe pursue using the offload compression support recently added to LLVM as an alternative?

@FlorianHeigl:

suggestion: generally, try to focus a bit on things other than forcibly reducing your user base.

@jeffdaily (Collaborator):

#143986 added --offload-compress to our builds to help reduce our binary size without removing gfx archs.

There is effort underway to support generic targets, as well.
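As a rough sketch of the idea (the flags and arch list here are illustrative; the actual build integration lives in #143986), compressing device code bundles at compile time looks something like:

```shell
# Illustrative only: --offload-compress asks the compiler to compress
# the per-arch device code bundles embedded in the fat binary, so that
# keeping older archs costs less binary size. Arch list is an assumption.
hipcc --offload-compress \
      --offload-arch=gfx900 --offload-arch=gfx906 --offload-arch=gfx90a \
      -c my_kernel.hip -o my_kernel.o
```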

@snarkyalyx commented Sep 28, 2025

@jithunnair-amd Who cares about the increasing size of ROCm binaries? Have you listened to your users?

People want long-term support so that prosumer and consumer use can be fulfilled. The compute is there, the 6700 XT is a modern card, and you shouldn't be ending support so soon, especially since this is an argument for people to switch to NVIDIA, which supported Kepler GPUs (released 2012) until September 2024.


Labels

ci-no-td Do not run TD on this PR ciflow/rocm Trigger "default" config CI on ROCm Merged module: rocm AMD GPU support for Pytorch open source Reverted topic: not user facing topic category


9 participants