Pull requests: NVIDIA/TensorRT-LLM
[#12298][feat] Add Prometheus gauge metrics for Kubernetes Inference Gateway routing
Community want to contribute
PRs initiated from Community
#12564 opened Mar 27, 2026 by BenjaminBraunDev
[None][doc] Add years to README dates and organize into collapsible year sections
Community want to contribute
#12562 opened Mar 27, 2026 by devabhixda
1 task done
[#12071][fix] Guard SWA block detachment for non-SWA and beam search
Community want to contribute
#12559 opened Mar 27, 2026 by wojciech-wais
1 task
[https://nvbugs/5919796][test] AutoDeploy: unwaive Super V3 autodeploy failure
#12556 opened Mar 26, 2026 by galagam
1 task done
[https://nvbugs/5916092][fix] Fix MTP+PP hang by preserving speculative layer weights on last PP rank
#12555 opened Mar 26, 2026 by xxi-nv
3 tasks
[None][chore] Fix failing KV Cache Transceiver Tests from #11574
#12554 opened Mar 26, 2026 by ekou24
1 task done
[None][infra] Waive failed multinode tests in GB200 stage
#12553 opened Mar 26, 2026 by yuanjingx87
1 task done
[None][chore] Better Empty File Error Handling for trtllm-bench
#12552 opened Mar 25, 2026 by yijingl-nvidia
1 task done
[None][infra] Skip already-applied patches gracefully in 3rdparty FetchContent
#12550 opened Mar 25, 2026 by achartier
1 task done
[None][infra] Add container scanning to plc nightly pipeline
#12549 opened Mar 25, 2026 by yuanjingx87
1 task done
[None][fix] Replace assertions with warnings for unsupported logits/logprobs in speculative sampler
Community want to contribute
#12547 opened Mar 25, 2026 by yifjiang
3 tasks
[None][feat] Add production-level Prometheus metrics (iteration stats, config info, token counters, phase histograms)
Community want to contribute
#12545 opened Mar 25, 2026 by nvyutwu
5 tasks
[None][feat] Enable NVFP4 KV cache support in trtllm-gen attention
#12544 opened Mar 25, 2026 by yihwang-nv
1 task done
[TRTLLMINF-37][feat] Add CI agent failure analysis to L0_MergeRequest…
#12543 opened Mar 25, 2026 by dpitman-nvda
1 task done
[https://nvbugs/6018172][fix] Add synchronization calls to warmup when host cache offloading is active
#12539 opened Mar 25, 2026 by longlee0622 • Draft
1 task
[TRTLLM-11318][feat] move VisualGen APIs to a separate dir
VisualGen
#12538 opened Mar 25, 2026 by zhenhuaw-me
1 task done
[None][feat] Add Mamba2 MTP SSM cache CUDA kernel for tree-based speculative decoding
#12537 opened Mar 25, 2026 by JadoTu
1 task done
[None][test] Enhance performance tests by adding GPU availability check in test_perf.py
#12535 opened Mar 25, 2026 by yufeiwu-nv
1 task done
[None][doc] Add MoE developer guide for fused_moe module
#12534 opened Mar 25, 2026 by xxi-nv
2 tasks done
[https://nvbugs/5989920][test] Unwaive DeepSeekV3 nvfp4 mtp3_fp8kv_chunked test
#12533 opened Mar 25, 2026 by yizhang-nv
1 task done
[None][docs] Add docstrings to cpp_custom_ops, model_config, and llm_args
#12532 opened Mar 25, 2026 by longcheng-nv
1 task done
[TRTLLM-10061][feat] Add support of linear attention state for C++ KV cache manager
#12531 opened Mar 25, 2026 by VALLIS-NERIA
2 tasks done