Add gpt_oss model #354

christian-lms · 2025-08-05T15:07:02Z

Contribution by LM Studio Team. Initial implementation, more details to follow. Please note numerical precision considerations vs. the reference implementation.

Co-authored-by: Neil Mehta <neil@lmstudio.ai> Co-authored-by: Matt Clayton <matt@lmstudio.ai>


awni · 2025-08-05T15:20:19Z

mlx_lm/models/gpt_oss.py

+        x = x * mx.expand_dims(expert_weights, axis=2)
+        return x.sum(axis=1)


Sometimes it's important to do the sum in fp32. Maybe check another implementation to see how it handles this?
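
To illustrate the accumulation-dtype concern, here is a minimal NumPy sketch. NumPy stands in for MLX so the idea is easy to run; the function name `combine_experts` and its `accumulate_fp32` flag are hypothetical and not part of this PR. It assumes `x` has shape (tokens, experts, hidden) and `expert_weights` has shape (tokens, experts), which matches the `expand_dims(..., axis=2)` / `sum(axis=1)` pattern in the diff, and it upcasts only for the reduction.

```python
import numpy as np


def combine_experts(x, expert_weights, accumulate_fp32=True):
    """Weighted sum over the expert axis (hypothetical helper, not the PR's code).

    x:              (tokens, experts, hidden), e.g. float16
    expert_weights: (tokens, experts), same dtype as x
    """
    # Scale each expert's output by its routing weight.
    weighted = x * np.expand_dims(expert_weights, axis=2)
    if accumulate_fp32:
        # Upcast before reducing so partial sums are rounded in fp32,
        # then cast back to the input dtype.
        return weighted.astype(np.float32).sum(axis=1).astype(x.dtype)
    # Reduce directly in the input dtype.
    return weighted.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 4, 8)).astype(np.float16)
    w = np.full((2, 4), 0.25, dtype=np.float16)
    low = combine_experts(x, w, accumulate_fp32=False)
    high = combine_experts(x, w, accumulate_fp32=True)
    # Largest disagreement between the two accumulation strategies.
    print(np.abs(low.astype(np.float32) - high.astype(np.float32)).max())
```

In MLX the same pattern would presumably be written as `x.astype(mx.float32).sum(axis=1).astype(x.dtype)`. Whether the upcast matters for this model is the open question raised in the review comment.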


psm-2 · 2025-08-05T18:10:12Z

@awni Is fine-tuning included in this update?

niryuu · 2025-08-05T18:31:40Z

ValueError: Unsupported quantization method mxfp4
Do we need to update any additional libraries?

christian-lms · 2025-08-05T19:47:02Z

@niryuu where are you getting this error?

awni · 2025-08-05T20:01:33Z

Presumably from trying to load the original model in mlx-lm (which is expected to not work). You can reproduce it with e.g. mlx_lm.convert --hf-path openai/gpt-oss-20b. I'll try and add a dequant path for that so we can load them directly.
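
For context on what a dequant path for mxfp4 involves, here is a rough NumPy sketch of decoding one MXFP4 tensor. It is not the code added to mlx-lm. It follows the general MX format: blocks of 32 four-bit E2M1 values that share one 8-bit power-of-two scale with bias 127. The function name `dequantize_mxfp4`, the array layout, and the nibble order (low nibble first) are assumptions made for illustration.

```python
import numpy as np

# The 16 E2M1 code points (1 sign bit, 2 exponent bits, 1 mantissa bit),
# indexed by the 4-bit code.
FP4_VALUES = np.array(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,
     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0],
    dtype=np.float32,
)


def dequantize_mxfp4(blocks, scales):
    """Decode MXFP4 blocks to float32 (illustrative sketch).

    blocks: uint8, shape (n_blocks, 16); two 4-bit codes packed per byte
    scales: uint8, shape (n_blocks,); shared exponent with bias 127
    returns float32, shape (n_blocks, 32)
    """
    lo = blocks & 0x0F  # assumption: low nibble holds the earlier element
    hi = blocks >> 4
    # Interleave so each byte expands to [lo, hi] in order.
    codes = np.stack([lo, hi], axis=-1).reshape(blocks.shape[0], -1)
    values = FP4_VALUES[codes]
    # Shared scale is 2 ** (scale - 127); the NaN encoding (255) is not handled.
    exponents = scales.astype(np.int32) - 127
    return np.ldexp(values, exponents[:, None])
```

A real loader would also need to map the checkpoint's tensor names and shapes onto this layout, which this sketch does not attempt.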

awni · 2025-08-05T21:15:30Z

Ok you should be able to load the original models now. I added a dequant step if needed:

mlx_lm.convert --hf-path openai/gpt-oss-20b -q --q-bits 8

And:

mlx_lm.generate --model mlx_model --prompt "Write a story about Einstein" -m 512

On an M2 Ultra:

<|channel|>analysis<|message|>The user: "Write a story about Einstein". They want a story about Einstein. We need to produce a story. The user didn't specify length or style. We can choose a creative narrative. Maybe a fictional story where Einstein is a character, or a story about his life, or a whimsical story. We can incorporate his personality, his relativity, his curiosity, his interactions. We can make it a short story, maybe with a twist. The user didn't specify constraints. We can produce a story that is engaging, maybe with a moral. Let's think: We can write a story about Einstein as a child, or as a professor, or as a time traveler. Or we can write a story about a young student who meets Einstein. Or a story about Einstein's relativity in a metaphorical sense. Or a story about Einstein's love for music. Or a story about Einstein's relationship with his wife. Or a story about Einstein's involvement in the Manhattan Project. Or a story about Einstein's philosophical musings. Or a story about Einstein's time in America. Or a story about Einstein's interactions with other scientists. Or a story about Einstein's childhood in Ulm. Or a story about Einstein's later years. Or a story about Einstein's "thought experiment" of riding a beam of light. Or a story about Einstein's "relativity of simultaneity" in a narrative. Or a story about Einstein's "E=mc^2" as a metaphor for love. Or a story about Einstein's "relativity" in a social context. Or a story about Einstein's "relativity" in a comedic way. Or a story about Einstein's "relativity" in a magical realism style. Or a story about Einstein's "relativity" in a sci-fi setting. Or a story about Einstein's "relativity" in a children's story. The user didn't specify. We can choose a creative approach. Let's write a story about a young boy who meets Einstein in a dream, and learns about relativity and the importance of curiosity. Or a story about Einstein's time in Princeton, where he meets a young student. 
Or a story about Einstein's "thought experiment" of riding a light beam, but in a narrative form. Or a story about Einstein's "relativity" as a metaphor for empathy. Or a story about Einstein's "relativity" in a comedic way. Let's choose a story that is accessible, maybe a short story with a moral. Let's write a story about a young
==========
Prompt: 72 tokens, 224.537 tokens-per-sec
Generation: 512 tokens, 84.333 tokens-per-sec
Peak memory: 22.345 GB

christian-lms · 2025-08-05T21:37:48Z

Yep, it works! Uploading quants now. Are we good to merge?

stakodiak · 2025-08-05T21:50:00Z

Yep. LGTM

awni

Great work @christian-lms and the rest of the LM Studio team!!

psm-2 · 2025-08-06T02:08:20Z

@awni Would it be possible to support fine-tuning?
ValueError: Lora does not support gpt_oss_moe

altaic · 2025-08-06T21:41:00Z

@psm-2 looks like all the other support will land soon in #357

Add gpt_oss model

2abc2ab


awni reviewed Aug 5, 2025
