
Pull requests: ml-explore/mlx-lm

Pull requests list

Add timings to server responses
#1279 opened May 16, 2026 by spicyneuron (Contributor)
Restrict think-state scan to assistant prefill tail
#1277 opened May 15, 2026 by eilidhmae
Add Gemma 4 assistant (MTP drafter) model class
#1276 opened May 14, 2026 by broomva
feat: add --idle-timeout to unload model after inactivity
#1274 opened May 14, 2026 by nish2292
8 tasks done
Add logits processor arguments to mlx_lm.generate
#1273 opened May 13, 2026 by realyxl
Support max_kv_size configuration in HTTP server
#1272 opened May 13, 2026 by r-bahuguna
Add Olmo3 tool parser
#1271 opened May 11, 2026 by anthonyhchan
2 tasks done
docs: add Qwen3 QLoRA Apple Silicon example
#1270 opened May 11, 2026 by SysCd
Add Granite tool parser
#1264 opened May 9, 2026 by jonpspri
Add Responses API support
#1263 opened May 9, 2026 by blairhudson
Support for Zyphra/ZAYA1-base
#1261 opened May 9, 2026 by kyr0
Fix LFM2.5 tool parser inference
#1260 opened May 8, 2026 by blairhudson
Fix server XTC crash from heterogeneous xtc_special_tokens
#1258 opened May 7, 2026 by odysa Draft
3 of 5 tasks
Drop redundant lm_head AWQ quant triple in load_model
#1247 opened May 6, 2026 by scyyh11
add: lfm2/2.5 tool parser
#1246 opened May 5, 2026 by jbuchananr
3 tasks done
chore: remove unused imports and variables
#1244 opened May 5, 2026 by odysa
Add PLaMo 3 model support
#1234 opened Apr 30, 2026 by mitmul