[LoRA]: Add LoRA support to Mistral's Voxtral models #24517
Conversation
Code Review
This pull request adds LoRA support for Mistral's Voxtral models, correctly scoping the adapters to the language model components. The changes are implemented by inheriting from the SupportsLoRA interface and providing a get_mm_mapping method to distinguish between the language model, connector, and audio tower. The implementation is clean, correct, and follows the established patterns in the vLLM codebase for enabling LoRA on multimodal models. The accompanying documentation update is also accurate. Overall, this is a solid contribution.
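For readers unfamiliar with the pattern the review refers to, here is a minimal sketch of what a `SupportsLoRA` + `get_mm_mapping()` change typically looks like in vLLM. The class body and the module prefixes (`language_model`, `audio_language_adapter`, `whisper_encoder`) are illustrative assumptions, not the exact names from this PR.

```python
# Minimal sketch of the SupportsLoRA + get_mm_mapping pattern.
# Module prefixes below are assumptions, not the exact names in the Voxtral diff.
from torch import nn

from vllm.model_executor.models.interfaces import SupportsLoRA
from vllm.model_executor.models.module_mapping import MultiModelKeys


class VoxtralForConditionalGeneration(nn.Module, SupportsLoRA):
    def get_mm_mapping(self) -> MultiModelKeys:
        # Scope LoRA to the language model only; the connector and the
        # audio tower are excluded from adapter targeting.
        return MultiModelKeys.from_string_field(
            language_model="language_model",     # assumed prefix
            connector="audio_language_adapter",  # assumed prefix
            tower_model="whisper_encoder",       # assumed prefix
        )
```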
To test the implementation, use this test adapter.

Script to spin up the vLLM server:

```bash
export VLLM_LOGGING_LEVEL=DEBUG
export VLLM_LOGGING_PREFIX="[vllm]"
MODEL="mistralai/Voxtral-Mini-3B-2507"
PORT=8000
GPU_MEM_UTIL=0.95
HOST="0.0.0.0"
TENSOR_PARALLEL_SIZE=1
MAX_MODEL_LEN=8192
# LoRA configuration
LORA0_NAME="lora0"
LORA0_PATH="/path/to/your/lora/adapter"
# ---------------------------------------------------------------------------
python -m vllm.entrypoints.openai.api_server \
--model "$MODEL" \
--tokenizer-mode mistral \
--config-format mistral \
--load-format mistral \
--host "$HOST" \
--port "$PORT" \
--gpu-memory-utilization "$GPU_MEM_UTIL" \
--tensor-parallel-size "$TENSOR_PARALLEL_SIZE" \
--max-model-len "$MAX_MODEL_LEN" \
--trust-remote-code \
--enable-lora \
--max-loras 1 \
--max-lora-rank 32 \
--lora-modules "$LORA0_NAME=$LORA0_PATH" \
--served-model-name "$MODEL" \
    --compilation-config '{"level": 3, "cudagraph_mode": "PIECEWISE"}'
```

Send request:

```python
import requests
import json
import base64
# Test the vLLM server endpoint
url = "http://localhost:8000/v1/chat/completions"
headers = {"Content-Type": "application/json"}
# Read and encode the audio file
audio_path = "/path/to/your/audio/file.wav"
with open(audio_path, "rb") as audio_file:
audio_data = base64.b64encode(audio_file.read()).decode('utf-8')
payload = {
"model": "lora0",
"messages": [
{"role": "system", "content": "Transcribe the following audio as it is."},
{"role": "user", "content": [
{
"type": "input_audio",
"input_audio": {
"data": audio_data,
"format": "wav"
}
}
]}
],
"max_tokens": 1024,
"temperature": 0.1
}
response = requests.post(url, headers=headers, json=payload, timeout=30)
result = response.json()
print(json.dumps(result, indent=2))
```
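As a quick sanity check (assuming the serve command above is used as-is on localhost:8000), the adapter registered via `--lora-modules` should also appear in the server's model list:

```python
# Sanity check: the LoRA adapter registered via --lora-modules should be listed
# alongside the base model (assumes the server above is running on localhost:8000).
import requests

models = requests.get("http://localhost:8000/v1/models", timeout=10).json()
print([m["id"] for m in models["data"]])  # expect the base model and "lora0"
```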
cc: @DarkLight1337, could you review and approve this so it can be merged?
Please fix the branch conflicts first.
Signed-off-by: Yash Pratap Singh <yashsingh20001@gmail.com>
Conflicts resolved!
jeejeelee left a comment
Thank you. I assume you have tested this locally.
Yes, I did. I also added the scripts I used to test LoRA support, for reference.
@DarkLight1337, most of the tests have passed; we can merge!
Add LoRA support to Mistral's Voxtral models (`mistralai/Voxtral-Mini-3B-2507` and `mistralai/Voxtral-Small-24B-2507`), scoped strictly to the language model components.

Changes
- `SupportsLoRA` interface on `VoxtralForConditionalGeneration`
- `get_mm_mapping()` method to filter LoRA modules
- `docs/models/supported_models.md` updated to mark Voxtral as LoRA-supported (✅︎)

Closes #24516
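For completeness, a hedged sketch of exercising the adapter through vLLM's offline Python API. The adapter path is a placeholder, the text-only chat message is used for brevity, and the arguments simply mirror the server flags above rather than anything prescribed by this PR.

```python
# Sketch only: offline inference with a LoRA adapter on Voxtral (paths are placeholders).
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="mistralai/Voxtral-Mini-3B-2507",
    tokenizer_mode="mistral",
    config_format="mistral",
    load_format="mistral",
    enable_lora=True,
    max_lora_rank=32,
)

outputs = llm.chat(
    [{"role": "user", "content": "Say hello."}],  # text-only message for brevity
    SamplingParams(max_tokens=32, temperature=0.1),
    lora_request=LoRARequest("lora0", 1, "/path/to/your/lora/adapter"),
)
print(outputs[0].outputs[0].text)
```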