Build vLLM nightly wheels for CUDA 13.0 #163239
Conversation
Signed-off-by: Huy Do <huydhn@gmail.com>
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163239
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 63e37c3 with merge base a260163.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hmm... these segmentation faults are annoying. One noticeable fact is that they are all failing on aarch64.
Yeah, they are coming from compiling xformers (https://github.com/facebookresearch/xformers/releases/tag/v0.0.32.post2) on aarch64. I don't know what the issue is about yet, so I'd appreciate any thoughts you have.
I have not encountered segfaults like that, but my first action would be decreasing …
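The suggestion above is truncated in the capture; one common knob for this class of failure is the build parallelism. A minimal sketch (an assumption, not taken from this PR): cap MAX_JOBS, which the torch.utils.cpp_extension build machinery used by xformers honors, so fewer nvcc/gcc processes run at once; compiler segfaults on memory-constrained aarch64 runners are often OOM kills in disguise.

```yaml
# Hypothetical GitHub Actions step; the version tag matches the release
# linked above, everything else is illustrative.
- name: Build xformers wheel with reduced parallelism
  env:
    MAX_JOBS: "4"   # drop to 2 if the segfaults persist
  run: |
    pip wheel -v --no-deps \
      git+https://github.com/facebookresearch/xformers.git@v0.0.32.post2
```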
This is currently blocked by a segfault when compiling xformers on aarch64.
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here.
Rebase failed due to a command error. Raised by https://github.com/pytorch/pytorch/actions/runs/17938711036
@huydhn do we know if xformers builds flash-attn?
Fixed: facebookresearch/xformers#1337
Thanks @johnnynunez for the fix! And yes, xformers builds flash-attn.
@ptrblck @huydhn all the PRs needed for vLLM CUDA 13 have been merged into public vLLM (including flash-attention, the Blackwell family, and CUTLASS v4.2.1); the only missing piece is facebookresearch/xformers#1337. I think it has not been merged yet because I was pointing at 2.9.0, and the CUDA 13.0 tests were failing because those wheels don't exist yet.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
Rebase failed due to a command error. Raised by https://github.com/pytorch/pytorch/actions/runs/18439392278
I do not see …
```diff
  matrix:
    platform: [ 'manylinux_2_28_x86_64', 'manylinux_2_28_aarch64' ]
-   device: [ 'cu128', 'cu129' ]
+   device: [ 'cu128', 'cu129', 'cu130' ]
```
Do we really care about cu128 here?
Not really, I think; this is just to stay in sync with PyTorch. I will clean up 12.8 later once 2.9 is out and vLLM is officially updated to 2.9 + CUDA 12.9.
Yeah, I think I will circle back on this once 2.9 is out. The xformers FA build failure is still there, so keeping the matrix in sync means one less moving piece. Let me know if that makes sense to you.
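For context, a sketch of the resulting build matrix (values copied from the diff above; the surrounding `strategy:` nesting is assumed): two platforms times three devices expands to six vLLM wheel builds per nightly run.

```yaml
strategy:
  matrix:
    # 2 platforms x 3 CUDA versions = 6 wheel-build jobs per run
    platform: [ 'manylinux_2_28_x86_64', 'manylinux_2_28_aarch64' ]
    device: [ 'cu128', 'cu129', 'cu130' ]
```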
Now that the cu130 index is public, they can run this: facebookresearch/xformers#1344
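With the cu130 nightly index public, a quick smoke test becomes possible; a minimal sketch (the step name is hypothetical, and the URL follows the standard PyTorch nightly index pattern):

```yaml
- name: Smoke-test the cu130 nightly index
  run: |
    pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu130
    python -c "import torch; print(torch.__version__, torch.version.cuda)"
```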
@pytorchbot merge -f 'vLLM builds are ok'
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Now that vllm-project/vllm#24599 has been merged, we can build vLLM nightly wheels for CUDA 13.0.
Pull Request resolved: pytorch#163239
Approved by: https://github.com/malfet, https://github.com/atalman