[CD] CUDA 13 specific followup changes#162455

Closed
tinglvv wants to merge 3 commits into pytorch:main from tinglvv:cu130-followup

@tinglvv
Collaborator

@tinglvv tinglvv commented Sep 9, 2025

Follow-up for the CUDA 13 bring-up in #159779:
sm50-70 should not be added to the SBSA build arch list, as those older archs never had Arm support.
Remove platform_machine from PYTORCH_EXTRA_INSTALL_REQUIREMENTS.

cc @atalman @malfet @ptrblck @nWEIdia
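
To make the second change in the description concrete, here is a minimal sketch of stripping a `platform_machine` clause from a requirements string. It assumes the string holds PEP 508 requirements joined by ` | `, each carrying a marker of the form `platform_system == 'Linux' and platform_machine == 'x86_64'`. The helper `drop_platform_machine` and the package names are hypothetical and are not taken from this PR's diff.

```python
import re

# Matches an " and platform_machine == '<value>'" clause inside a marker.
_MACHINE_CLAUSE = re.compile(r"\s+and\s+platform_machine\s*==\s*(['\"])[^'\"]+\1")


def drop_platform_machine(requirements: str) -> str:
    """Remove the platform_machine clause from each '|'-separated requirement.

    Assumption: requirements are separated by '|' and no marker contains '|'.
    """
    parts = [part.strip() for part in requirements.split("|")]
    return " | ".join(_MACHINE_CLAUSE.sub("", part) for part in parts if part)


if __name__ == "__main__":
    # Hypothetical package names, used only for illustration.
    before = (
        "example-cuda-runtime==13.0.0; "
        "platform_system == 'Linux' and platform_machine == 'x86_64' | "
        "example-cudnn==9.0.0; "
        "platform_system == 'Linux' and platform_machine == 'x86_64'"
    )
    print(drop_platform_machine(before))
```

With the clause removed, each requirement is gated only on `platform_system`, so the same string can apply to any machine architecture on that platform.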

@tinglvv tinglvv requested a review from a team as a code owner September 9, 2025 04:13
@pytorch-bot

pytorch-bot bot commented Sep 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162455

Note: Links to docs will display an error until the docs builds have been completed.

❌ 7 New Failures, 2 Cancelled Jobs, 51 Pending

As of commit 2c5c1b3 with merge base b5e6e58:

NEW FAILURES - The following jobs have failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: releng release notes category label Sep 9, 2025
@tinglvv tinglvv changed the title [CD] Rm sm50-70 for sbsa and remove platform_machine for PYTORCH_EXTRA_INSTALL_REQUIREMENTS [CD] CUDA 13 specific followup changes Sep 9, 2025
@tinglvv tinglvv added the ciflow/binaries Trigger all binary build and upload jobs on the PR label Sep 9, 2025
@bdhirsh bdhirsh added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Sep 9, 2025
@tinglvv
Collaborator Author

tinglvv commented Sep 9, 2025

The failure is unrelated to this PR:

2025-09-09T22:47:36.4532587Z FAILED [0.1411s] dynamo/cpython/3_13/test_collections.py::TestNamedTuple::test_namedtuple_subclass_issue_24931 - RuntimeError: Unexpected success, please remove `test/dynamo_expected_failures/CPython313-test_collections-TestNamedTuple.test_namedtuple_subclass_issue_24931

@atalman
Contributor

atalman commented Sep 9, 2025

@tinglvv regarding "sm50-70 should not be added to sbsa build arch list, as previous archs had no support for arm": do we have some kind of NVIDIA doc specifying the list of arches supported on aarch64 builds for different CUDA versions? Please comment with details: #162455

@tinglvv
Collaborator Author

tinglvv commented Sep 10, 2025

Sorry, while there seems to be no doc that specifies it, our testing for SBSA started from Ampere, and SBSA wheels never supported sm50-75. We therefore hope to keep the same support matrix for CUDA 13 as for CUDA 12, starting from sm_80.
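
The support matrix described in this reply can be sketched as a filter over an architecture list. This is only an illustration: it assumes a semicolon-separated list in the style of `TORCH_CUDA_ARCH_LIST`, the helper name `filter_sbsa_archs` is hypothetical, and the sample list is made up rather than copied from the build scripts.

```python
import re


def filter_sbsa_archs(arch_list: str, min_arch: tuple[int, int] = (8, 0)) -> str:
    """Keep only entries at or above min_arch (default 8.0, i.e. sm_80)."""
    kept = []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        # Parse the leading "major.minor"; suffixes such as "+PTX" are preserved.
        match = re.match(r"(\d+)\.(\d+)", entry)
        if match is None:
            continue  # skip empty or unparseable entries
        if (int(match.group(1)), int(match.group(2))) >= min_arch:
            kept.append(entry)
    return ";".join(kept)


if __name__ == "__main__":
    # Hypothetical input list, for illustration only.
    print(filter_sbsa_archs("5.0;6.0;7.0;7.5;8.0;9.0;10.0;12.0+PTX"))
```

Comparing `(major, minor)` tuples rather than floats or strings keeps the ordering correct for two-digit major versions such as 10.0 and 12.0.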

@tinglvv
Collaborator Author

tinglvv commented Sep 10, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased cu130-followup onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout cu130-followup && git pull --rebase)

@atalman
Contributor

atalman commented Sep 11, 2025

@pytorchmergebot merge -f "change looks good, merging"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
Follow up for CUDA 13 bring up pytorch#159779
sm50-70 should not be added to sbsa build arch list, as previous archs had no support for arm.
remove platform_machine from PYTORCH_EXTRA_INSTALL_REQUIREMENTS

Pull Request resolved: pytorch#162455
Approved by: https://github.com/atalman
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
cleonard530 pushed a commit to cleonard530/pytorch that referenced this pull request Sep 22, 2025
@atalman
Contributor

atalman commented Sep 23, 2025

@pytorchbot cherry-pick --onto release/2.9 --fixes "Critical binary build fix" -c critical

@pytorchbot
Collaborator

Cherry picking #162455

Command git -C /home/runner/work/pytorch/pytorch cherry-pick -x bb1d53bc47109c7c97e5fa072280d05b04e023e5 returned non-zero exit code 1

Auto-merging .ci/aarch64_linux/aarch64_ci_build.sh
Auto-merging .ci/aarch64_linux/aarch64_wheel_ci_build.py
Auto-merging .github/scripts/generate_binary_build_matrix.py
CONFLICT (content): Merge conflict in .github/scripts/generate_binary_build_matrix.py
Auto-merging .github/workflows/generated-linux-aarch64-binary-manywheel-nightly.yml
CONFLICT (content): Merge conflict in .github/workflows/generated-linux-aarch64-binary-manywheel-nightly.yml
Auto-merging .github/workflows/generated-linux-binary-manywheel-main.yml
CONFLICT (content): Merge conflict in .github/workflows/generated-linux-binary-manywheel-main.yml
Auto-merging .github/workflows/generated-linux-binary-manywheel-nightly.yml
CONFLICT (content): Merge conflict in .github/workflows/generated-linux-binary-manywheel-nightly.yml
error: could not apply bb1d53bc471... [CD] CUDA 13 specific followup changes (#162455)
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git cherry-pick --continue".
hint: You can instead skip this commit with "git cherry-pick --skip".
hint: To abort and get back to the state before "git cherry-pick",
hint: run "git cherry-pick --abort".
hint: Disable this message with "git config set advice.mergeConflict false"

atalman pushed a commit to atalman/pytorch that referenced this pull request Sep 24, 2025
atalman added a commit that referenced this pull request Sep 25, 2025
…From CUDA 12.6 and CUDA 12.8 builds (#162455) (#163764)

* [CD] CUDA 13 specific followup changes (#162455)

Follow up for CUDA 13 bring up #159779
sm50-70 should not be added to sbsa build arch list, as previous archs had no support for arm.
remove platform_machine from PYTORCH_EXTRA_INSTALL_REQUIREMENTS

Pull Request resolved: #162455
Approved by: https://github.com/atalman

* update

---------

Co-authored-by: Ting Lu <tingl@nvidia.com>
dsashidh pushed a commit to dsashidh/pytorch that referenced this pull request Sep 26, 2025
huydhn added a commit to huydhn/pytorch that referenced this pull request Oct 17, 2025
Signed-off-by: Huy Do <huydhn@gmail.com>
pytorchmergebot pushed a commit that referenced this pull request Oct 18, 2025
When trying to bring cu129 back in #163029, I mainly looked at #163029 and missed another tweak coming from #162455.

I discovered this issue when testing aarch64+cu129 builds in https://github.com/pytorch/test-infra/actions/runs/18603342105/job/53046883322?pr=7373. Surprisingly, no tests appear to run for the aarch64 CUDA build, judging from https://hud.pytorch.org/pytorch/pytorch/commit/79a37055e790482c12bf32e69b28c8e473d0209d.
Pull Request resolved: #165794
Approved by: https://github.com/malfet
pytorchbot pushed a commit that referenced this pull request Oct 18, 2025

(cherry picked from commit 9095a9d)
huydhn added a commit that referenced this pull request Oct 18, 2025
[CD] Apply the fix from #162455 to aarch64+cu129 build (#165794)

(cherry picked from commit 9095a9d)

Co-authored-by: Huy Do <huydhn@gmail.com>
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
…h#165794)

zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 22, 2025
…h#165794)

Labels

ciflow/binaries (Trigger all binary build and upload jobs on the PR)
Merged
open source
release notes: releng (release notes category)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
