Pull requests: vllm-project/vllm
Add DeepSeek-V4 XPU support with FP8 KV cache
deepseek
Related to DeepSeek models
intel-gpu
Related to Intel GPU
rocm
Related to AMD ROCm
v1
#42919
opened May 18, 2026 by
majian4work
Contributor
add svg
documentation
Improvements or additions to documentation
#42918
opened May 18, 2026 by
gracie-guo
4 tasks
docs: clarify CUDA nightly wheel index priority
documentation
Improvements or additions to documentation
nvidia
#42917
opened May 18, 2026 by
AmanPandey28
4 tasks
[Attn][Triton] Support FP8 q_scale/k_scale/v_scale in per-tensor attention path
v1
#42916
opened May 18, 2026 by
yiliu30
Contributor
[MoE][Perf] Replace torch.compile pack with fused Triton kernels for FlashInfer routed MoE
nvidia
#42914
opened May 18, 2026 by
Darcy-Lee
4 tasks
Revert "[torch.compile] Add patch for fullgraph compilation" (#42686)
#42913
opened May 18, 2026 by
vllm-agent
Draft
[ROCm][CI] Stabilize ROCm pooling and multimodal CI
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#42909
opened May 18, 2026 by
AndreasKaratzas
Collaborator
[MRV2][CI] Add update_config method for V2 Runner
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#42907
opened May 18, 2026 by
jikunshang
Collaborator
4 tasks
[Bugfix] Warn when renderer_num_workers has no effect on offline LLM
bug
Something isn't working
frontend
#42905
opened May 18, 2026 by
DaoyuanLi2816
[feature] [xgrammar] support patternProperties and propertyNames kw for object types
ci/build
structured-output
v1
#42904
opened May 18, 2026 by
cjackal
Contributor
3 of 4 tasks
RISC-V ILP Optimization: Add instruction-level parallelism for transcendental functions
cpu
Related to CPU backends
documentation
Improvements or additions to documentation
performance
Performance-related issues
#42900
opened May 17, 2026 by
mohankku
Contributor
Codec: token-native binary transport for /v1/completions + /v1/chat/completions streaming
frontend
#42896
opened May 17, 2026 by
wdunn001
4 tasks
Support Nomic-embed-text-v1 with transformers v5
#42894
opened May 17, 2026 by
ieBoytsov
Contributor
2 of 4 tasks
[KV Events] Switch event structs from array to map encoding
documentation
Improvements or additions to documentation
#42892
opened May 17, 2026 by
sagearc
Contributor
Remove Pydantic v2.11 workaround: simplify Mistral tokenizer tool call handling
cpu
Related to CPU backends
documentation
Improvements or additions to documentation
frontend
mistral
Related to Mistral models
performance
Performance-related issues
#42891
opened May 17, 2026 by
mohankku
Contributor
4 tasks done
Support nvfp4 kv with kv-cache-dtype-skip-layers sliding_window
v1
#42890
opened May 17, 2026 by
sychen52
Contributor
4 tasks
[Refactor] Remove dead code
ready
ONLY add when PR is ready to merge/full CI is needed
#42889
opened May 17, 2026 by
yewentao256
Member
[Model Runner v2] fix pd accuracy
bug
Something isn't working
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
#42888
opened May 17, 2026 by
ZJY0516
Member
4 tasks