Pull requests: ggml-org/llama.cpp
ui: silence a11y caption warning and tidy vitest setup
examples
server/ui
#23293
opened May 18, 2026 by
ServeurpersoCom
Contributor
Move to backend sampling for MTP draft path
#23287
opened May 18, 2026 by
gaugarg-nv
Contributor
common: fix --fit verbosity with --verbosity 4
examples
#23282
opened May 18, 2026 by
JohannesGaessler
Contributor
server-context: fall back to full seq clear when partial KV eviction is refused
examples
server
#23280
opened May 18, 2026 by
ServeurpersoCom
Contributor
server : print graphs reused in slot timings
examples
server
#23279
opened May 18, 2026 by
ggerganov
Member
common: fix --help for --verbosity
#23278
opened May 18, 2026 by
JohannesGaessler
Contributor
github: mention --log-file in issue templates
devops
improvements to build systems and github actions
#23277
opened May 18, 2026 by
JohannesGaessler
Contributor
ui: prevent checkbox click from propagating in tools submenu
examples
server/ui
#23276
opened May 18, 2026 by
MaxKruse
StepFun 3.5 MTP
model
Model specific
python
python script changes
script
Script related
#23274
opened May 18, 2026 by
pwilkin
Member
rpc : keep last_graph_uid in the device context
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
#23273
opened May 18, 2026 by
rgerganov
Member
ci : install server kleidiai runner dependencies
devops
improvements to build systems and github actions
#23259
opened May 18, 2026 by
CISC
Member
Fix imatrix generation for MTP models
examples
python
python script changes
#23258
opened May 18, 2026 by
de-wim
mtmd: add --mmproj-device argument
examples
server
#23255
opened May 18, 2026 by
Interpause
4 tasks done
NvFp4 CT and Fp8 as Q8 conversion support
python
python script changes
#23250
opened May 18, 2026 by
ynankani
Contributor
Add path validation and exit code error handling to bench script
script
Script related
#23248
opened May 18, 2026 by
Eamon2009
common : support schema-constrained decoding for Gemma 4 tool calls
testing
Everything test related
#23247
opened May 18, 2026 by
rsauciuc
rpc : track last graph uid per (endpoint, device), not per backend context
ggml
changes relating to the ggml tensor library for machine learning
#23243
opened May 18, 2026 by
ssam18
Contributor
fit : add --fit-show-mem to print probe table at INFO
examples
#23232
opened May 17, 2026 by
Bikkies
vulkan: fix GDN shader on MoltenVK by replacing gl_SubgroupInvocationID with gl_LocalInvocationIndex
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#23228
opened May 17, 2026 by
csfercoci
CUDA: route batch>=4 quantized matmul to MMQ on AMD MFMA hardware
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#23227
opened May 17, 2026 by
jadenmach2
server + ui: SSE Replay Buffer
examples
server/ui
server
#23226
opened May 17, 2026 by
ServeurpersoCom
Contributor
Draft
ggml: allow split-mode tensor to use different quantization types.
ggml
changes relating to the ggml tensor library for machine learning
#23225
opened May 17, 2026 by
RedToasty