Pull requests: lightseekorg/tokenspeed
Pull requests list
fix(deepseek-v4): corrected profiling to estimate cache capacity.
#173
opened May 17, 2026 by
SimonCqk
Contributor
perf(moe): triton biased grouped topk for deepseek-v3 routing
#171
opened May 17, 2026 by
roycho96
Contributor
[WIP] perf: add gluon fp16 prefill kernel
#165
opened May 16, 2026 by
borontion
Contributor
feat(kvstore): support mamba l2 cache transfers
#162
opened May 15, 2026 by
XucSh
Contributor
Perf[Qwen3.5]: eliminate Mamba intermediate state memcpy in MTP target-verify
#159
opened May 15, 2026 by
tuanzhangCS
Contributor
Draft
[WIP] feat(deepseek-v4): support prefix cache snapshots
#146
opened May 14, 2026 by
SimonCqk
Contributor
[Draft]feat(deepseek-v4): support MTP speculative decoding
#123
opened May 13, 2026 by
dongjiyingdjy
Contributor
perf(qwen3): cut H100 decode kernel time -8% with fused stride-aware kernels
high priority
#81
opened May 11, 2026 by
qywu
Collaborator
5 of 7 tasks
perf: chunked-prefill prefix cache update for non-hybrid models
#22
opened May 7, 2026 by
LorrinWWW
Contributor
fix: wait per-layer on drafter KV pool during cpu cache loadback
#6
opened May 6, 2026 by
LorrinWWW
Contributor