ml-explore / mlx-lm Public

Pull requests: ml-explore/mlx-lm

151 Open 643 Closed

Pull requests list

Add timings to server responses

#1279 opened May 16, 2026 by spicyneuron Contributor

Restrict think-state scan to assistant prefill tail

#1277 opened May 15, 2026 by eilidhmae

Add Gemma 4 assistant (MTP drafter) model class

#1276 opened May 14, 2026 by broomva

fix: make generation_stream per-thread to fix server crash on worker threads

#1275 opened May 14, 2026 by nish2292

4 tasks done

feat: add --idle-timeout to unload model after inactivity

#1274 opened May 14, 2026 by nish2292

8 tasks done

Add logits processor arguments to mlx_lm.generate

#1273 opened May 13, 2026 by realyxl

Support max_kv_size configuration in HTTP server

#1272 opened May 13, 2026 by r-bahuguna

Add Olmo3 tool parser

#1271 opened May 11, 2026 by anthonyhchan

2 tasks done

docs: add Qwen3 QLoRA Apple Silicon example

#1270 opened May 11, 2026 by SysCd

fix(server): wire --prompt-cache-bytes CLI flag to LRUPromptCache

#1267 opened May 11, 2026 by andreinknv

feat(server): add /v1/embeddings route via mlx_embeddings

#1265 opened May 9, 2026 by andreinknv

Add Granite tool parser

#1264 opened May 9, 2026 by jonpspri

Add Responses API support

#1263 opened May 9, 2026 by blairhudson

Support for Zyphra/ZAYA1-base

#1261 opened May 9, 2026 by kyr0

Fix LFM2.5 tool parser inference

#1260 opened May 8, 2026 by blairhudson

Fix server XTC crash from heterogeneous xtc_special_tokens

#1258 opened May 7, 2026 by odysa • Draft

3 of 5 tasks

Fix ArraysCache missing is_trimmable/trim for hybrid model prompt cache

#1254 opened May 6, 2026 by EagerofLight

Fix BatchRotatingKVCache rotated flag deserializing to True

#1251 opened May 6, 2026 by odysa

Fix mlx_lm.server --adapter-path silently ignored at startup

#1249 opened May 6, 2026 by odysa

Drop redundant lm_head AWQ quant triple in load_model

#1247 opened May 6, 2026 by scyyh11

add: lfm2/2.5 tool parser

#1246 opened May 5, 2026 by jbuchananr

3 tasks done

chore: remove unused imports and variables

#1244 opened May 5, 2026 by odysa

fix: wrap ast.literal_eval in try/except for Qwen3 tool parser

#1239 opened May 3, 2026 by lawcontinue

fix(gemma4): add stop_gradient on MoE router top_k_indices

#1238 opened May 2, 2026 by TrentCarter

Add PLaMo 3 model support

#1234 opened Apr 30, 2026 by mitmul
