
Build vLLM nightly wheels for CUDA 13.0 #163239

Closed
huydhn wants to merge 14 commits into pytorch:main from huydhn:vllm-wheel-cuda13

Conversation

@huydhn
Contributor

@huydhn huydhn commented Sep 18, 2025

Now that vllm-project/vllm#24599 has been merged

Signed-off-by: Huy Do <huydhn@gmail.com>
@pytorch-bot

pytorch-bot bot commented Sep 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163239

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 63e37c3 with merge base a260163 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Aidyn-A
Collaborator

Aidyn-A commented Sep 18, 2025

Hmm... These segmentation faults are annoying:

2025-09-18T03:56:20.8919447Z #21 260.0 sh: line 1:  1716 Segmentation fault      (core dumped) ptxas -arch=sm_90a -m64 -v --generate-line-info "/tmp/tmpxft_00000199_00000000-6_flash_fwd_hdim128_bf16_sm100.compute_90a.ptx" -o "/tmp/tmpxft_00000199_00000000-11_flash_fwd_hdim128_bf16_sm100.compute_90a.cubin" > /tmp/tmpxft_00000199_00000000-13_189d18d0_stdout 2> /tmp/tmpxft_00000199_00000000-13_189d18d0_stderr
...
2025-09-18T03:56:26.6301630Z #21 265.7 sh: line 1:  1766 Segmentation fault      (core dumped) ptxas -arch=sm_90a -m64 -v --generate-line-info "/tmp/tmpxft_0000019c_00000000-6_flash_fwd_hdim128_bf16_sm90.compute_90a.ptx" -o "/tmp/tmpxft_0000019c_00000000-11_flash_fwd_hdim128_bf16_sm90.compute_90a.cubin" > /tmp/tmpxft_0000019c_00000000-13_403da8f0_stdout 2> /tmp/tmpxft_0000019c_00000000-13_403da8f0_stderr
...
2025-09-18T03:56:35.8654592Z #21 275.0 sh: line 1:  1813 Segmentation fault      (core dumped) ptxas -arch=sm_90a -m64 -v --generate-line-info "/tmp/tmpxft_000001b1_00000000-6_flash_fwd_hdim128_fp16_sm90.compute_90a.ptx" -o "/tmp/tmpxft_000001b1_00000000-11_flash_fwd_hdim128_fp16_sm90.compute_90a.cubin" > /tmp/tmpxft_000001b1_00000000-13_3bfe3ad0_stdout 2> /tmp/tmpxft_000001b1_00000000-13_3bfe3ad0_stderr
...
2025-09-18T03:58:44.1437362Z #21 403.4 sh: line 1:  2262 Segmentation fault      (core dumped) ptxas -arch=sm_90a -m64 -v --generate-line-info "/tmp/tmpxft_00000600_00000000-6_flash_fwd_hdim192_128_bf16_sm90.compute_90a.ptx" -o "/tmp/tmpxft_00000600_00000000-11_flash_fwd_hdim192_128_bf16_sm90.compute_90a.cubin" > /tmp/tmpxft_00000600_00000000-13_344c4aa0_stdout 2> /tmp/tmpxft_00000600_00000000-13_344c4aa0_stderr
...
2025-09-18T03:58:53.9488721Z #21 413.2 sh: line 1:  2280 Segmentation fault      (core dumped) ptxas -arch=sm_90a -m64 -v --generate-line-info "/tmp/tmpxft_00000679_00000000-6_flash_fwd_hdim192_128_fp16_sm90.compute_90a.ptx" -o "/tmp/tmpxft_00000679_00000000-11_flash_fwd_hdim192_128_fp16_sm90.compute_90a.cubin" > /tmp/tmpxft_00000679_00000000-13_1bcf8520_stdout 2> /tmp/tmpxft_00000679_00000000-13_1bcf8520_stderr
...
2025-09-18T03:59:43.9530305Z #21 463.1 sh: line 1:  2325 Segmentation fault      (core dumped) ptxas -arch=sm_90a -m64 -v --generate-line-info "/tmp/tmpxft_00000769_00000000-6_flash_fwd_hdim192_bf16_sm90.compute_90a.ptx" -o "/tmp/tmpxft_00000769_00000000-11_flash_fwd_hdim192_bf16_sm90.compute_90a.cubin" > /tmp/tmpxft_00000769_00000000-13_2d628f0_stdout 2> /tmp/tmpxft_00000769_00000000-13_2d628f0_stderr

One noticeable fact is that they are all failing on sm_90a.

@huydhn
Contributor Author

huydhn commented Sep 18, 2025

Yeah, they are coming from compiling xformers https://github.com/facebookresearch/xformers/releases/tag/v0.0.32.post2 on aarch64. I don't know what the issue is about yet, so I'd appreciate any thoughts you have.

@Aidyn-A
Collaborator

Aidyn-A commented Sep 18, 2025

I have not encountered segfaults like that, but my first action would be decreasing MAX_JOBS because those CUTLASS kernels are extremely compile-hungry.

Signed-off-by: Huy Do <huydhn@gmail.com>
@huydhn
Contributor Author

huydhn commented Sep 18, 2025

I have not encountered segfaults like that, but my first action would be decreasing MAX_JOBS because those CUTLASS kernels are extremely compile-hungry.

Ohh, you're spot on, it works after I lowered MAX_JOBS. I spoke too soon: CI hadn't actually run yet because of the merge conflicts, hence the green CI signals >_<
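For reference, the MAX_JOBS workaround discussed above can be sketched as a small shell snippet that sizes the parallel compile jobs to available memory, so the memory-hungry CUTLASS/flash-attn translation units don't exhaust RAM. The 4 GiB-per-job ratio is a guess for illustration, not a value taken from this PR:

```shell
# Cap parallel nvcc jobs based on total RAM (Linux; reads /proc/meminfo).
# Assumption: roughly one compile job per 4 GiB of memory.
mem_kib=$(awk '/MemTotal/ {print $2}' /proc/meminfo 2>/dev/null || echo 0)
jobs=$(( mem_kib / (4 * 1024 * 1024) ))
if [ "$jobs" -lt 1 ]; then jobs=1; fi
export MAX_JOBS=$jobs
echo "MAX_JOBS=$MAX_JOBS"
```

The vLLM/xformers builds read MAX_JOBS from the environment, so exporting it before invoking the wheel build is enough.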

Signed-off-by: Huy Do <huydhn@gmail.com>
@huydhn
Contributor Author

huydhn commented Sep 20, 2025

This is currently blocked by a segfault on ptxas -arch=sm_90a that @Aidyn-A discovered. We have only seen this on aarch64, but x86 might be affected too. Maybe I could try my luck and skip the aarch64 build for now.

@huydhn
Contributor Author

huydhn commented Sep 23, 2025

@pytorchbot rebase -b main

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/main pull/163239/head returned non-zero exit code 1

Rebasing (1/2)
Auto-merging .github/ci_commit_pins/vllm.txt
CONFLICT (content): Merge conflict in .github/ci_commit_pins/vllm.txt
error: could not apply 82df8a8a0ee... Build vLLM nightly wheels for CUDA 13.0
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply 82df8a8a0ee... # Build vLLM nightly wheels for CUDA 13.0

Raised by https://github.com/pytorch/pytorch/actions/runs/17938711036
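The resolution steps in the rebase hints above can be exercised end-to-end. Here is a throwaway-repo sketch (branch name mirrors this PR; the pin contents "old-pin"/"pr-pin"/"main-pin" are invented for the demo) showing one way to keep the PR's side of the conflicted pin file:

```shell
# Reproduce a pin-file rebase conflict and resolve it, following the
# "git add ... / git rebase --continue" hints from the failed rebase job.
set -e
work=$(mktemp -d)
cd "$work"
git init -q -b main demo
cd demo
git config user.email ci@example.com
git config user.name ci
mkdir -p .github/ci_commit_pins
echo "old-pin" > .github/ci_commit_pins/vllm.txt
git add -A
git commit -qm "base"
git checkout -qb vllm-wheel-cuda13
echo "pr-pin" > .github/ci_commit_pins/vllm.txt
git commit -qam "Build vLLM nightly wheels for CUDA 13.0"
git checkout -q main
echo "main-pin" > .github/ci_commit_pins/vllm.txt
git commit -qam "bump pin on main"
git checkout -q vllm-wheel-cuda13
git rebase main || true                                  # stops on the vllm.txt conflict
git checkout --theirs .github/ci_commit_pins/vllm.txt    # keep this PR's pin
git add .github/ci_commit_pins/vllm.txt
GIT_EDITOR=true git rebase --continue
cat .github/ci_commit_pins/vllm.txt                      # pr-pin
```

During a rebase, `--theirs` refers to the commit being replayed (the PR commit), which is why it keeps the PR's pin here; resolving by hand in an editor works just as well.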

@ptrblck
Collaborator

ptrblck commented Sep 24, 2025

Yeah, they are coming from compiling xformers...

@huydhn do we know if flash-attn is also built as part of xformers? If so, this fix might be needed: https://github.com/Dao-AILab/flash-attention/pull/1860/files

@johnnynunez
Contributor

fixed: facebookresearch/xformers#1337
cc @Aidyn-A

@huydhn
Contributor Author

huydhn commented Sep 26, 2025

Thanks @johnnynunez for the fix! And yes, xformers builds flash-attn.

@johnnynunez
Contributor

johnnynunez commented Oct 3, 2025

@ptrblck @huydhn all PRs necessary for vLLM CUDA 13 were merged into public vLLM (including flash-attention and the Blackwell family + CUTLASS v4.2.1). The only thing still missing is facebookresearch/xformers#1337. I think it is not merged yet because I was pointing to 2.9.0 and CUDA 13.0, and the tests were failing because those don't exist yet.

@huydhn
Contributor Author

huydhn commented Oct 12, 2025

@pytorchbot rebase

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict pull/163239/head returned non-zero exit code 1

Rebasing (1/2)
Auto-merging .github/ci_commit_pins/vllm.txt
CONFLICT (content): Merge conflict in .github/ci_commit_pins/vllm.txt
error: could not apply 82df8a8a0ee... Build vLLM nightly wheels for CUDA 13.0
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Could not apply 82df8a8a0ee... # Build vLLM nightly wheels for CUDA 13.0

Raised by https://github.com/pytorch/pytorch/actions/runs/18439392278

Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
@huydhn huydhn requested a review from atalman October 12, 2025 06:52
@huydhn huydhn marked this pull request as ready for review October 12, 2025 06:53
@huydhn huydhn requested a review from a team as a code owner October 12, 2025 06:53
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
@Aidyn-A
Collaborator

Aidyn-A commented Oct 13, 2025

I do not see Build vLLM wheels / Build cu130 vLLM wheel on manylinux_2_28_aarch64 in the CI, was it skipped?

matrix:
  platform: [ 'manylinux_2_28_x86_64', 'manylinux_2_28_aarch64' ]
- device: [ 'cu128', 'cu129' ]
+ device: [ 'cu128', 'cu129', 'cu130' ]
Contributor

Do we really care about cu128 here?

Contributor Author
@huydhn huydhn Oct 15, 2025

Not really, I think; this is just to stay in sync with PyTorch. I will clean 12.8 up later once 2.9 is out and vLLM is officially updated to 2.9 + CUDA 12.9.

@huydhn
Contributor Author

huydhn commented Oct 15, 2025

I do not see Build vLLM wheels / Build cu130 vLLM wheel on manylinux_2_28_aarch64 in the CI, was it skipped?

Yeah, I think I will circle back on this once 2.9 is out. The xformers FA build failure is still there, so that's one less moving piece. Let me know if that makes sense to you.

@johnnynunez
Contributor

I do not see Build vLLM wheels / Build cu130 vLLM wheel on manylinux_2_28_aarch64 in the CI, was it skipped?

Yeah, I think I will circle back on this once 2.9 is out. The xformers FA build failure is still there, so that's one less moving piece. Let me know if that makes sense to you.

now that index cu130 is public, they can run this: facebookresearch/xformers#1344

@huydhn
Contributor Author

huydhn commented Oct 16, 2025

@pytorchbot merge -f 'vLLM builds are ok'

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here
