[New Model] Aero-1-Audio#658

Merged
Luodian merged 4 commits into main from dev/aero
Apr 30, 2025

@kcz358 kcz358 commented Apr 29, 2025

🚀 Introducing Aero-1-Audio — a compact yet mighty audio model.

🧠 Built on Qwen-2.5-1.5B
⚡ Trained in <24h on just 16×H100
🎧 Handles 15+ min audio seamlessly
💡 Outperforms bigger models like Whisper, Qwen-2-Audio & commercial services from ElevenLabs/Scribe

Aero shows: smart data > massive scale.

GitHub Repo: https://github.com/EvolvingLMMs-Lab/Aero-1
Model Checkpoints: https://huggingface.co/lmms-lab/Aero-1-Audio-1.5B
Evaluation Results: https://github.com/EvolvingLMMs-Lab/lmms-eval/tree/dev/aero
Cookbook: https://www.lmms-lab.com/posts/lmms-lab-docs/aero_audio/

Evaluation Result

20250424_092927_results.json
20250421_203304_results.json
20250421_202840_results.json
20250421_170326_results.json

*Note: for some benchmarks, we use gpt-4o-2024-11-20 as the judge model.

Examples

We support batch evaluation for faster inference. Note that results may differ slightly across batch sizes.

# Task to evaluate and the checkpoint to load
TASK=open_asr_tedlium
CKPT_PATH=lmms-lab/Aero-1-Audio
echo "$TASK"
# Replace commas with underscores so a multi-task list can be used as a log suffix
TASK_SUFFIX="${TASK//,/_}"
echo "$TASK_SUFFIX"

# Launch on 8 processes with a batch size of 32
accelerate launch --num_processes 8 --main_process_port 30000 -m lmms_eval \
    --model aero \
    --model_args pretrained="$CKPT_PATH",attn_implementation="flash_attention_2" \
    --tasks "$TASK" \
    --batch_size 32 \
    --log_samples_suffix "$TASK_SUFFIX" \
    --output_path ./logs/ --verbosity DEBUG
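
With a single task, `TASK_SUFFIX` is identical to `TASK`, so the substitution only matters when several tasks are passed as a comma-separated list. A minimal sketch of what the expansion does in that case, assuming bash and using placeholder task names (`task_a` and `task_b` are hypothetical, not real lmms_eval tasks):

```shell
#!/usr/bin/env bash
# Hypothetical task names, used only to illustrate the expansion.
TASK="task_a,task_b"

# ${VAR//pattern/replacement} replaces every match, not just the first.
# This form is a bash feature and is not available in plain POSIX sh.
TASK_SUFFIX="${TASK//,/_}"

echo "$TASK_SUFFIX"   # prints: task_a_task_b
```

The resulting string contains no commas, so it can be passed to `--log_samples_suffix` as a single value.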

@Luodian Luodian merged commit 819f67e into main Apr 30, 2025
2 checks passed
@kcz358 kcz358 deleted the dev/aero branch December 15, 2025 09:08