VoxServe: a serving system for SpeechLMs

VoxServe is a serving system for Speech Language Models (SpeechLMs). It provides low-latency, high-throughput inference for language models trained on speech tokens, specifically text-to-speech (TTS) and speech-to-speech (STS) models.

News

Usage

You can install VoxServe via pip:

pip install vox-serve
vox-serve --model <model-name> --port <port-number>

Alternatively, clone the repository and start the inference server with the launch module:

git clone https://github.com/vox-serve/vox-serve.git
cd vox-serve
python -m vox_serve.launch --model <model-name> --port <port-number>

Once the server is running, send requests to its /generate endpoint:

# Generate audio from text
curl -X POST "http://localhost:<port-number>/generate" -F "text=Hello world" -F "streaming=true" -o output.wav

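# For models that support audio input
# (curl -F needs name=@file; the field name "audio" is an assumption, check the documentation)
curl -X POST "http://localhost:<port-number>/generate" -F "text=Hello world" -F "audio=@input.wav" -F "streaming=true" -o output.wav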
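
The text-only request can also be sent from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `encode_multipart` and `generate` are hypothetical, and the only things taken from the curl example are the `/generate` endpoint and the `text` and `streaming` form fields. It assumes the server accepts the same multipart form that `curl -F` sends.

```python
import urllib.request
import uuid


def encode_multipart(fields, boundary=None):
    """Encode text form fields as multipart/form-data, as `curl -F` does.

    Hypothetical helper for illustration; not part of VoxServe.
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def generate(text, port, out_path="output.wav", host="localhost", streaming=True):
    """POST text to /generate and write the response bytes to out_path.

    Assumes the endpoint and field names shown in the curl example.
    """
    body, content_type = encode_multipart(
        {"text": text, "streaming": "true" if streaming else "false"}
    )
    request = urllib.request.Request(
        f"http://{host}:{port}/generate",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Read the response in chunks so streamed audio is written as it arrives.
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while chunk := response.read(8192):
            out.write(chunk)
    return out_path
```

With a server running, `generate("Hello world", port=<port-number>)` would write the returned audio to output.wav, mirroring the first curl command.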

We currently support the following TTS and STS models:

We are actively working on supporting more models.

See the ./examples folder for more usage examples.