Pull requests: ml-explore/mlx-lm
fix: make generation_stream per-thread to fix server crash on worker threads
#1275 opened May 14, 2026 by nish2292
4 tasks done
feat: add --idle-timeout to unload model after inactivity
#1274 opened May 14, 2026 by nish2292
8 tasks done
fix(server): wire --prompt-cache-bytes CLI flag to LRUPromptCache
#1267 opened May 11, 2026 by andreinknv
feat(server): add /v1/embeddings route via mlx_embeddings
#1265 opened May 9, 2026 by andreinknv
Fix ArraysCache missing is_trimmable/trim for hybrid model prompt cache
#1254 opened May 6, 2026 by EagerofLight
Fix BatchRotatingKVCache rotated flag deserializing to True
#1251 opened May 6, 2026 by odysa
Fix mlx_lm.server --adapter-path silently ignored at startup
#1249 opened May 6, 2026 by odysa
fix: wrap ast.literal_eval in try/except for Qwen3 tool parser
#1239 opened May 3, 2026 by lawcontinue
fix(gemma4): add stop_gradient on MoE router top_k_indices
#1238 opened May 2, 2026 by TrentCarter