
🌍 Universal AI Translator API (NLLB-200)


A high-performance translation API supporting 200+ languages using Meta's NLLB-200 model. Features optimized INT8 inference (600MB model footprint), an event-driven architecture built on Celery and Redis, and production-ready Docker deployment.


✨ Key Features

  • ⚡ Fast & Lightweight: CTranslate2 INT8 quantization (4x faster, 2.5GB → 600MB)
  • 🔄 Asynchronous Processing: Celery + Redis for non-blocking operations
  • 🚀 Production Ready: Dockerized, Swagger docs, zero cold-start with singleton pattern
  • 🌐 200+ Languages: English, Arabic, French, Chinese, Spanish, and more

🚀 Quick Start

Option 1: Docker Hub (Fastest - Recommended)

Run the pre-built containers directly from Docker Hub:

# Clone the repo
git clone https://github.com/Mohammed2372/Translation-API.git
cd Translation-API

# Pull and run the complete stack
docker-compose -f docker-compose.prod.yml up

This will automatically pull:

  • mohammed237/translation-api:webv1 (Django API)
  • mohammed237/translationapi:workerv1 (Celery worker with model)
  • redis:7-alpine (official Redis image)

Open the interactive API docs: http://localhost:8000/api/docs/

Option 2: Build from Source

Prerequisites

  • Docker & Docker Compose OR Python 3.10+ & Redis

Step 1: Get the Model

Option A - Use Kaggle Notebook (5 min):

  1. Open: Model Preparation Notebook
  2. Click "Copy and Edit" → Run all cells
  3. Download translator_model.zip

Option B - Run Locally:

jupyter notebook notebooks/quantize_translator_model.ipynb
# Sanity-check the converted model
python test_translator_model.py
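
Under the hood, the notebook's conversion step boils down to something like the following (a sketch, assuming the distilled 600M NLLB checkpoint from Hugging Face; the notebook may differ in details):

# Sketch: convert the Hugging Face checkpoint to an INT8 CTranslate2 model.
# "facebook/nllb-200-distilled-600M" is an assumption about the base model.
from ctranslate2.converters import TransformersConverter

converter = TransformersConverter("facebook/nllb-200-distilled-600M")
converter.convert("model/translator_model", quantization="int8", force=True)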

Step 2: Extract Model

/translationapi
├── model/
│   └── translator_model/      <-- Extract here
│       ├── model.bin
│       ├── config.json
│       └── shared_vocabulary.json

Step 3: Run

# With Docker
docker-compose up --build

# OR Manually (2 terminals)
# Terminal 1
python manage.py runserver

# Terminal 2
celery -A core worker -l info  # Add -P solo on Windows

📖 API Usage

Endpoint: POST /api/translate/

Request:

{
  "text": "Hello, how are you?",
  "source": "eng_Latn",
  "target": "arb_Arab"
}

Response:

{
  "status": "completed",
  "result": {
    "translated": "مرحبا، كيف حالك؟",
    "original": "Hello, how are you?"
  }
}
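
For example, calling the endpoint from Python (a minimal client sketch using the requests library, assuming the stack is running locally):

# Minimal client sketch; assumes the API is reachable on localhost:8000.
import requests

resp = requests.post(
    "http://localhost:8000/api/translate/",
    json={"text": "Hello, how are you?", "source": "eng_Latn", "target": "arb_Arab"},
    timeout=15,
)
print(resp.json())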

Interactive Docs: http://localhost:8000/api/docs/


🏗️ Architecture

graph LR
    A[Client] -->|HTTP| B[Django API]
    B -->|Task| C[Redis Queue]
    C --> D[Celery Worker]
    D -->|INT8 Model| E[CTranslate2]

Components:

  • Django API: Request validation, task dispatch
  • Redis: Message queue, traffic buffering
  • Celery Worker: Background AI processing (model loaded once)
  • CTranslate2: Optimized INT8 inference engine
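
To make the request path concrete, here is a minimal sketch of the view-side dispatch (hypothetical names; the repository's actual views.py and tasks.py may differ):

# Hypothetical sketch of the dispatch step in translationapi/views.py.
# Assumes Django REST Framework and a Celery task named translate_task.
from rest_framework.decorators import api_view
from rest_framework.response import Response

from .tasks import translate_task  # assumed task name

@api_view(["POST"])
def translate(request):
    # Validate minimally, then push the heavy work onto the Redis queue.
    data = request.data
    task = translate_task.delay(data["text"], data["source"], data["target"])
    return Response({"task_id": task.id, "status": "queued"})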

🧠 Technical Highlights

Why CTranslate2?

Serving the full-precision PyTorch checkpoint is slow and memory-hungry on CPU. CTranslate2 provides:

  • 4x faster inference via C++ optimization
  • INT8 quantization: 2.5GB → 600MB
  • CPU-friendly (no GPU needed)
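
A minimal inference sketch with the quantized model (assumes the NLLB tokenizer from transformers and the model path used elsewhere in this README):

# Sketch: translate one sentence with the INT8 CTranslate2 model on CPU.
import ctranslate2
import transformers

translator = ctranslate2.Translator("model/translator_model", device="cpu")
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M", src_lang="eng_Latn"  # assumed base model
)

source = tokenizer.convert_ids_to_tokens(tokenizer.encode("Hello, how are you?"))
results = translator.translate_batch([source], target_prefix=[["arb_Arab"]])
target = results[0].hypotheses[0][1:]  # drop the target language-code token
print(tokenizer.decode(tokenizer.convert_tokens_to_ids(target)))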

Singleton Pattern

Model loads once at worker startup (translationapi/ai_loader.py), eliminating 5+ second cold starts per request.
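
A minimal sketch of the pattern (a lazy-loading variant with hypothetical names; the actual ai_loader.py may load eagerly at startup):

# Sketch: a process-level singleton. The first call pays the load cost;
# every later call in the same worker process reuses the cached model.
import ctranslate2

_translator = None

def get_translator():
    global _translator
    if _translator is None:
        _translator = ctranslate2.Translator("model/translator_model", device="cpu")
    return _translator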

Smart Polling

The API waits up to 10 seconds for a result, so typical requests feel synchronous; if the worker is still busy (e.g. under heavy load), it returns the task ID so the client can check the status asynchronously.
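
Roughly, the waiting logic looks like this (a sketch with hypothetical names, not the repository's exact code):

# Sketch: block briefly for the result, then fall back to async polling.
from celery.exceptions import TimeoutError as CeleryTimeout
from celery.result import AsyncResult

def wait_for_result(task_id, timeout=10):
    result = AsyncResult(task_id)
    try:
        return {"status": "completed", "result": result.get(timeout=timeout)}
    except CeleryTimeout:
        # Still running: hand back the task ID so the client can poll later.
        return {"status": "pending", "task_id": task_id}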


📁 Project Structure

translationapi/
├── translationapi/
│   ├── views.py              # API endpoints
│   ├── tasks.py              # Celery tasks
│   └── ai_loader.py          # Model singleton
├── model/
│   └── translator_model/     # INT8 model (600MB)
├── notebooks/
│   ├── quantize_translator_model.ipynb
│   └── test_translator_model.py  # sanity-check the converted model
├── docker-compose.yml        # Build images
├── docker-compose.prod.yml   # Production with Docker Hub images
└── requirements.txt

📊 Performance

Metric        Value
Model Size    600MB (INT8)
Inference     ~200ms/sentence (CPU)
Memory        ~800MB total
Throughput    100+ req/min

📧 Links