Feature request: Add FunASR/SenseVoice model integration #3058

@LauraGPT

Feature Request

Add FunASR models (SenseVoice, Paraformer) as available ASR models in SpeechBrain.

Why

SpeechBrain is the go-to speech processing toolkit. FunASR models would be a valuable addition:

| Model | Params | Architecture | Languages | Speed |
|---|---|---|---|---|
| SenseVoice-Small | 234M | Non-autoregressive | 50+ | 25x RT (CPU) |
| Paraformer-large | 220M | Non-autoregressive | zh/en/ja/ko/yue | 170x RT (GPU) |
| Fun-ASR-Nano | 800M | LLM-based (encoder+decoder) | 31 | GPU via vLLM |
| cam++ | 7.2M | Speaker diarization | - | - |
| FSMN-VAD | 5.2M | Voice activity detection | - | - |

Key advantages

  • Non-autoregressive: Paraformer and SenseVoice avoid hallucination issues common in autoregressive models
  • Industrial-grade: Deployed at scale (1M+ pip installs/month)
  • Complete pipeline: VAD + ASR + punctuation + diarization in one package
  • HuggingFace models: Available on HF Hub (funasr/paraformer-large, funasr/sensevoice-small)
  • Transformers integration in progress (PR #46180)
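
To make the "complete pipeline" point concrete, here is a minimal sketch of what a thin adapter around FunASR could look like. It assumes FunASR's `AutoModel` constructor and its `generate(input=...)` call returning a list of dicts with a `text` key. The model aliases (`paraformer-zh`, `fsmn-vad`, `ct-punc`, `cam++`) are taken from FunASR's own naming and should be checked against the installed version. `ASRBackend`, `load_funasr_backend`, and `transcribe_file` are hypothetical names for illustration, not an existing SpeechBrain or FunASR API.

```python
from typing import Any, Protocol


class ASRBackend(Protocol):
    """Anything exposing a FunASR-style generate() call (hypothetical interface)."""

    def generate(self, input: str, **kwargs: Any) -> list: ...


def load_funasr_backend(asr: str = "paraformer-zh") -> ASRBackend:
    """Build a VAD + ASR + punctuation + diarization pipeline.

    The model aliases below are assumptions; verify them against the
    FunASR release you have installed.
    """
    # Imported lazily so that funasr stays an optional dependency.
    from funasr import AutoModel

    return AutoModel(
        model=asr,
        vad_model="fsmn-vad",
        punc_model="ct-punc",
        spk_model="cam++",
    )


def transcribe_file(backend: ASRBackend, wav_path: str) -> str:
    """Run the backend on one file and join the text of all returned results."""
    results = backend.generate(input=wav_path)
    return " ".join(r.get("text", "") for r in results).strip()
```

Keeping the backend behind a small protocol means the wrapper can be exercised with a stub object in tests, with no model download, and the FunASR import only happens when `load_funasr_backend()` is actually called.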
