Pull requests: vllm-project/vllm

[Mamba] Consolidate Mamba's Attention Logic (#28133)
Labels: v1. Opened Nov 5, 2025 by Josephasafg. Draft. 1 of 5 tasks.

[misc] add vLLM Beijing Meetup (#28127)
Labels: documentation. Opened Nov 5, 2025 by jjzhang.

[Chore] Remove Nemotron-Nano-VL config copy (#28126)
Labels: ready. Opened Nov 5, 2025 by Isotr0py. 3 of 5 tasks.

[Bugfix]: missing partial content if OpenAI tool calling is enabled (#28122)
Labels: frontend, tool-calling. Opened Nov 5, 2025 by dr75.

[bugfix] avoid NIXL_ERR_REMOTE_DISCONNECT in nixl_connector when Prefill dies (#28120)
Labels: kv-connector. Opened Nov 5, 2025 by hasB4K.

[Bugfix] Fix Qwen3-Reranker-8B load (#28117)
Labels: qwen. Opened Nov 5, 2025 by noooop. 5 tasks.

[V0 deprecation] Clean up is_v1_supported_oracle (#28116)
Labels: ready, v1. Opened Nov 5, 2025 by wangxiyuan. 5 tasks.

Fix crash when unsupported response format types are requested, improving vLLM's robustness (#28113)
Labels: frontend. Opened Nov 5, 2025 by Jpivot. 5 tasks.

[V0 deprecation] Deprecate use_v1 parameter (#28112)
Labels: rocm, tpu. Opened Nov 5, 2025 by wangxiyuan. 5 tasks.

[Misc] Remove duplicate code (#28111)
Labels: frontend, ready. Opened Nov 5, 2025 by chaunceyjiang. 5 tasks.

[CLI] add --max-tokens to vllm complete (#28109)
Labels: frontend. Opened Nov 5, 2025 by Iceber. 3 of 5 tasks.

Use maximum number of batched tokens to autotune MoE (#28106)
Labels: nvidia. Opened Nov 5, 2025 by nvjullin. 5 tasks.

[kernel][perf] support uncontiguous input for rms_norm kernel (#28103)
Opened Nov 5, 2025 by izhuhaoran.

[Kernel] Fuse computation of g and beta for Gated Delta Net (#28095)
Labels: qwen, ready. Opened Nov 5, 2025 by ZJY0516. 5 tasks.

[Docs] Add guide to debugging vLLM-torch.compile integration (#28094)
Labels: documentation. Opened Nov 5, 2025 by zou3519.

Add Flashinfer trtllm moe to compressed tensor FP4 path (#28090)
Opened Nov 5, 2025 by Victor49152. Draft. 5 tasks.

[PERF] Decouple projections from GDN custom op. Attempt 2 (#28083)
Labels: qwen, ready. Opened Nov 5, 2025 by vadiklyutiy.

remove resolve_op_overloads and use splitting_ops directly (#28081)
Opened Nov 5, 2025 by BoyuanFeng. Draft.

Add runai model streamer e2e test for GCS (#28079)
Labels: ci/build. Opened Nov 4, 2025 by amacaskill. 5 tasks done.