Subtitle translation API using GPT. Dynamic context windows preserve conversational flow. Features auto-fallback to DeepL, concurrent processing, and FastAPI.

yigitkonur/subtitle-llm-translator

🎬 context-aware-srt-translation 🎬

Stop translating subtitles line by line. Start shipping natural translations.

The smarter subtitle translator. It reads your SRT, groups sequential lines for context, and uses GPT to produce translations that actually sound human.

context-aware-srt-translation is the translator your subtitles deserve. Stop feeding GPT one line at a time and getting robotic, disconnected results. This service groups sequential subtitle lines together, giving the AI the context it needs to understand the conversation and produce translations that actually flow naturally.

  • 🧠 Context Windows: 3 lines translated together
  • ⚡ Concurrent Processing: parallel chunk translation
  • 🔄 Auto Fallback: OpenAI → DeepL seamlessly

How it works:

  • You: POST your SRT file to the API
  • Service: Groups lines into context windows, translates concurrently
  • Result: Natural translations that respect conversational flow
  • Bonus: Full statistics on what happened
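
The concurrent step with fallback can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `translate_all`, `translate_chunk`, and the two stub translators are hypothetical names, and the real service calls OpenAI and DeepL where the stubs return tagged strings.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical type: any async function that translates one chunk of lines.
Translator = Callable[[List[str]], Awaitable[List[str]]]


async def translate_chunk(chunk: List[str], primary: Translator,
                          fallback: Translator) -> List[str]:
    """Try the primary translator; on any error, use the fallback."""
    try:
        return await primary(chunk)
    except Exception:
        return await fallback(chunk)


async def translate_all(chunks: List[List[str]], primary: Translator,
                        fallback: Translator,
                        max_concurrent: int = 10) -> List[List[str]]:
    """Translate chunks in parallel, capped at max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(chunk: List[str]) -> List[str]:
        async with semaphore:
            return await translate_chunk(chunk, primary, fallback)

    # gather() returns results in the same order as the input chunks.
    return await asyncio.gather(*(bounded(c) for c in chunks))


# Stub translators standing in for the real OpenAI and DeepL services.
async def flaky_primary(chunk: List[str]) -> List[str]:
    if "boom" in chunk:
        raise RuntimeError("simulated primary failure")
    return [f"P:{line}" for line in chunk]


async def stub_fallback(chunk: List[str]) -> List[str]:
    return [f"F:{line}" for line in chunk]


def demo() -> List[List[str]]:
    chunks = [["hello", "world"], ["boom"]]
    return asyncio.run(translate_all(chunks, flaky_primary, stub_fallback))
```

The semaphore plays the role that `MAX_CONCURRENT_REQUESTS` appears to play in the service: it caps the number of in-flight calls while still letting chunks overlap.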

💥 Why This Slaps Other Methods

Line-by-line translation is a vibe-killer. Context windows make other methods look ancient.

โŒ Line-by-Line (Pain) โœ… Context Windows (Glory)
"I think we should..."  โ†’  "Sanฤฑrฤฑm biz..."
"...go there tomorrow"  โ†’  "...yarฤฑn oraya git"
Disconnected. Robotic. Wrong verb forms.
["I think we should...",
 "...go there tomorrow"]  โ†’  
["Bence yarฤฑn oraya...",
 "...gitmeliyiz"]
Connected. Natural. Correct grammar.

The difference is context. When GPT sees the full thought, it understands the sentence structure, maintains speaker tone, and produces translations humans would actually write.


🚀 Get Started in 60 Seconds

1. Clone & Install

git clone https://github.com/yigitkonur/context-aware-srt-translation-gpt.git
cd context-aware-srt-translation-gpt
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

2. Configure

cp .env.example .env
# Add your OpenAI API key (required)
# Add DeepL API key (optional fallback)

3. Run

python run.py

The API is now live at http://localhost:8000 🎉


🧠 How Context Windows Work

Instead of translating each subtitle line individually (which loses context), this service groups sequential lines:

┌─────────────────────────────────────────────────┐
│  Traditional: Line 1 → Translate → Output 1     │
│               Line 2 → Translate → Output 2     │
│               Line 3 → Translate → Output 3     │
│               ❌ No context between lines       │
└─────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────┐
│  Context Window:                                │
│  [Line 1, Line 2, Line 3] → Translate Together  │
│               ↓                                 │
│  [Output 1, Output 2, Output 3]                 │
│               ✅ AI sees the full picture       │
└─────────────────────────────────────────────────┘

This allows GPT to:

  • Maintain speaker continuity: same character, same voice
  • Preserve conversation flow: questions match answers
  • Handle split sentences: "I think..." + "...we should go" = coherent thought
  • Respect cultural context: idioms translated appropriately
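
The grouping step can be illustrated with a few lines of Python. This is only a sketch under the assumption that windows are consecutive and non-overlapping; `group_into_windows` and `flatten` are hypothetical helper names, and the service's real chunking logic may differ.

```python
from typing import List


def group_into_windows(lines: List[str], window_size: int = 3) -> List[List[str]]:
    """Split subtitle lines into consecutive, non-overlapping chunks.

    The last chunk may be shorter when len(lines) is not a multiple
    of window_size.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return [lines[i:i + window_size] for i in range(0, len(lines), window_size)]


def flatten(windows: List[List[str]]) -> List[str]:
    """Put translated chunks back into a single list, preserving order."""
    return [line for window in windows for line in window]


lines = ["I think we should...", "...go there tomorrow", "Fine.", "See you then."]
windows = group_into_windows(lines, window_size=3)
```

Here `windows` holds one chunk with the first three lines and a second chunk with the remaining line, so a split sentence such as "I think we should..." / "...go there tomorrow" reaches the model in one request.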

🎮 API Usage

Translate Subtitles

curl -X POST "http://localhost:8000/subtitle-translate" \
  -H "Content-Type: application/json" \
  -d '{
    "srt_content": "1\n00:00:01,000 --> 00:00:04,000\nHello, how are you?\n\n2\n00:00:05,000 --> 00:00:08,000\nI am doing great, thanks!",
    "source_language": "en",
    "target_language": "tr"
  }'

Response

{
  "translated_srt_content": "1\n00:00:01,000 --> 00:00:04,000\nMerhaba, nasılsın?\n\n2\n00:00:05,000 --> 00:00:08,000\nÇok iyiyim, teşekkürler!",
  "status": "success",
  "error_message": null,
  "stats": {
    "total_sentences": 2,
    "translated_sentences": 2,
    "failed_sentences": 0,
    "success_rate": 100.0,
    "openai_calls": 1,
    "deepl_calls": 0,
    "elapsed_seconds": 1.23
  }
}
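
For callers who prefer Python over curl, the same request can be made with the standard library alone. The sketch below uses only the endpoint and field names shown in the curl example and response above; `build_request` and `translate` are hypothetical helper names, not part of the project.

```python
import json
import urllib.request


def build_request(srt_content: str, source_language: str, target_language: str,
                  base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request for /subtitle-translate without sending it."""
    payload = {
        "srt_content": srt_content,
        "source_language": source_language,
        "target_language": target_language,
    }
    return urllib.request.Request(
        f"{base_url}/subtitle-translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(srt_content: str, source_language: str, target_language: str,
              timeout: float = 60.0) -> str:
    """Send the request and return the translated SRT text."""
    request = build_request(srt_content, source_language, target_language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The response carries a status field and an optional error message.
    if body.get("status") != "success":
        raise RuntimeError(body.get("error_message") or "translation failed")
    return body["translated_srt_content"]
```

With the server running locally, `translate(open("movie.srt", encoding="utf-8").read(), "en", "tr")` would return the translated subtitle text; the file name is only an example.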

Health Check

curl http://localhost:8000/health
# {"status": "healthy", "version": "2.0.0"}

โš™๏ธ Configuration

All settings via environment variables:

Variable                  Default       Description
OPENAI_API_KEY            (none)        Required. Your OpenAI API key
DEEPL_API_KEY             (none)        Optional fallback service
OPENAI_MODEL              gpt-4o-mini   Model for translations
OPENAI_TEMPERATURE        0.3           Lower = more consistent
CONTEXT_WINDOW_SIZE       3             Lines per translation chunk
MAX_CONCURRENT_REQUESTS   10            Parallel API calls
LOG_LEVEL                 INFO          Logging verbosity
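
As a rough illustration of how these variables and defaults could be read, here is a minimal sketch. The `Settings` dataclass and `load_settings` function are hypothetical and are not taken from the project's `config.py`; only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    deepl_api_key: Optional[str]
    openai_model: str
    openai_temperature: float
    context_window_size: int
    max_concurrent_requests: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping of environment variables."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=api_key,
        # An empty or missing DeepL key means the fallback is unavailable.
        deepl_api_key=env.get("DEEPL_API_KEY") or None,
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        openai_temperature=float(env.get("OPENAI_TEMPERATURE", "0.3")),
        context_window_size=int(env.get("CONTEXT_WINDOW_SIZE", "3")),
        max_concurrent_requests=int(env.get("MAX_CONCURRENT_REQUESTS", "10")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing the mapping in explicitly, instead of always reading `os.environ`, keeps the function easy to exercise in tests.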

📁 Project Structure

src/
├── config.py              # Environment configuration
├── models.py              # Pydantic request/response models
├── srt_parser.py          # SRT parsing & reconstruction
├── translator.py          # Main orchestration logic
├── main.py                # FastAPI application
└── services/
    ├── base.py            # Service interface
    ├── openai_service.py  # OpenAI implementation
    └── deepl_service.py   # DeepL fallback
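
To make "SRT parsing & reconstruction" concrete, here is a small sketch of what such a round trip might look like. It is an assumption-laden illustration, not the contents of `srt_parser.py`: `Cue`, `parse_srt`, and `build_srt` are hypothetical names, and the parser only handles well-formed input.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    index: int   # sequence number, e.g. 1
    timing: str  # e.g. "00:00:01,000 --> 00:00:04,000"
    text: str    # subtitle text, possibly spanning several lines


def parse_srt(content: str) -> List[Cue]:
    """Parse well-formed SRT text into cues, one per blank-line-separated block."""
    cues: List[Cue] = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 3:
            continue  # skip blocks without index, timing, and text
        cues.append(Cue(int(lines[0].strip()), lines[1].strip(), "\n".join(lines[2:])))
    return cues


def build_srt(cues: List[Cue]) -> str:
    """Rebuild SRT text from cues; timings are passed through untouched."""
    return "\n\n".join(f"{c.index}\n{c.timing}\n{c.text}" for c in cues)


sample = (
    "1\n00:00:01,000 --> 00:00:04,000\nHello, how are you?\n\n"
    "2\n00:00:05,000 --> 00:00:08,000\nI am doing great, thanks!"
)
cues = parse_srt(sample)
```

Only the `text` field of each cue would be sent for translation; the index and timing lines are copied back unchanged when the file is rebuilt.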

🔥 API Documentation

Interactive docs available when running:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

🛠️ Development

# Setup
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Run tests
pytest tests/ -v

# Run with hot reload
python run.py

🔥 Common Issues

Problem                Solution
OpenAI rate limit      Reduce MAX_CONCURRENT_REQUESTS
DeepL not working      Check DEEPL_API_KEY is set correctly
Translations cut off   Increase OPENAI_MAX_TOKENS
Wrong language codes   Use ISO 639-1 codes: en, tr, de, fr, etc.

Built with 🔥 because line-by-line subtitle translation is a crime against cinema.

MIT © Yiğit Konur
