A full-stack AI-powered web application that recognizes American Sign Language (ASL) gestures from a live webcam feed or uploaded images and translates them into text and speech in real time.
- 🎥 Real-time Webcam Capture - Capture ASL gestures directly from your webcam
- 📤 Image Upload - Upload pre-recorded gesture images
- 🤖 Dual AI Models - Switch between VGG16 and ResNet50 for prediction
- 🔊 Text-to-Speech - Hear predictions spoken aloud using Web Speech API
- 📝 Sentence Builder - Accumulate multiple predictions to build complete sentences
- 🎨 Modern UI/UX - Beautiful gradient designs with smooth animations
- 📱 Responsive Design - Works seamlessly on desktop and mobile devices
- ⚡ High Performance - Optimized for fast predictions and smooth user experience
Frontend:

- React 18 - Modern UI library
- Vite - Next-generation frontend tooling
- Tailwind CSS - Utility-first CSS framework
- Framer Motion - Smooth animations and transitions
- Lucide React - Beautiful icon set
- React Webcam - Webcam capture functionality
Backend:

- Flask - Lightweight Python web framework
- TensorFlow/Keras - Deep learning model inference
- OpenCV - Image processing
- Flask-CORS - Cross-origin resource sharing
AI Models:

- VGG16 - 16-layer convolutional neural network
- ResNet50 - 50-layer residual neural network
```
sign-language-translator/
│
├── backend/
│   ├── app.py                 # Flask API server
│   ├── utils.py               # Image preprocessing & model loading
│   ├── requirements.txt       # Python dependencies
│   ├── labels.json            # 40 ASL class labels
│   ├── model_vgg16.h5         # Pre-trained VGG16 model (to be added)
│   └── model_resnet.h5        # Pre-trained ResNet50 model (to be added)
│
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── Navbar.jsx         # Top navigation bar
│   │   │   ├── WebcamCapture.jsx  # Webcam & image capture
│   │   │   ├── PredictionBox.jsx  # Results display & sentence builder
│   │   │   └── Footer.jsx         # Footer with credits
│   │   ├── App.jsx            # Main application component
│   │   ├── main.jsx           # React entry point
│   │   └── index.css          # Global styles
│   ├── package.json           # Node.js dependencies
│   ├── vite.config.js         # Vite configuration
│   ├── tailwind.config.js     # Tailwind CSS configuration
│   └── index.html             # HTML template
│
└── README.md                  # This file
```
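In the tree above, `utils.py` is responsible for image preprocessing and model loading. The repository's actual implementation isn't reproduced here; below is a minimal sketch of what such helpers could look like for 64x64 RGB model inputs (function names are illustrative):

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

def preprocess_image(image_bytes, size=(64, 64)):
    """Decode raw upload bytes into a (1, 64, 64, 3) float32 batch."""
    buffer = np.frombuffer(image_bytes, dtype=np.uint8)
    image = cv2.imdecode(buffer, cv2.IMREAD_COLOR)   # BGR, uint8
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # match RGB training data
    image = cv2.resize(image, size)
    return np.expand_dims(image.astype('float32') / 255.0, axis=0)

def load_models():
    """Load both pre-trained models once at startup."""
    return {
        'vgg16': load_model('model_vgg16.h5'),
        'resnet50': load_model('model_resnet.h5'),
    }
```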
- Node.js (v18 or higher)
- Python (v3.8 or higher)
- pip - Python package manager (comes with Python)
The easiest way to run the application with automated setup:
Simply double-click `start_all.bat` (Windows) or run:

```bash
# Windows
start_all.bat

# Linux/Mac
./start_all.sh
```

That's it! The script will:
- ✅ Automatically check and install all requirements (only if needed)
- ✅ Start both backend and frontend servers
- ✅ Open the application in your browser
The first run takes ~3 minutes (installation); subsequent runs take ~10 seconds!
See STARTUP_GUIDE.md for detailed information about the startup scripts.
If you prefer manual setup:
```bash
git clone <repository-url>
cd sign-language-translator
```

Navigate to the backend directory and install dependencies:
```bash
cd backend
pip install -r requirements.txt
```

Important: Place your pre-trained models in the `backend/` directory:

- `model_vgg16.h5` - VGG16 trained model
- `model_resnet.h5` - ResNet50 trained model

If models are not available, the application will use dummy models for testing (predictions will be random).
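How that fallback is built isn't shown here; a minimal sketch of one way such a stand-in model could be constructed (illustrative only, the function name is hypothetical):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

def build_dummy_model(num_classes=40):
    """Untrained stand-in matching the real models' input/output shapes.

    Its random, untrained weights make every prediction effectively random.
    """
    return Sequential([
        Flatten(input_shape=(64, 64, 3)),
        Dense(num_classes, activation='softmax'),
    ])
```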
Navigate to the frontend directory and install dependencies:
```bash
cd ../frontend
npm install
```

To start the application with the startup scripts:

Windows:

```bash
start_all.bat
```

Linux/Mac:

```bash
./start_all.sh
```

To start manually, you'll need to run both the backend and frontend servers simultaneously.
Terminal 1 - Backend Server:

```bash
cd backend
python app.py
```

The Flask API will start on http://localhost:5000

Terminal 2 - Frontend Server:

```bash
cd frontend
npm run dev
```

The React app will start on http://localhost:3000

Open your browser and navigate to http://localhost:3000
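To confirm the backend is reachable, you can query the health-check endpoint from Python (the root path is an assumption; match it to the route defined in `app.py`):

```python
import requests

# Assumed health-check route; adjust to match app.py
resp = requests.get('http://localhost:5000/')
print(resp.status_code, resp.json())
```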
Windows:

```bash
stop_all.bat
```

Linux/Mac:

```bash
./stop_all.sh
```

Or press Ctrl+C in the terminal/command windows.
1. Run the startup script:
   - Windows: Double-click `start_all.bat`
   - Linux/Mac: Run `./start_all.sh`
2. Wait for the servers to start (~10 seconds)
3. The browser opens automatically to http://localhost:3000
- Allow Camera Access - Grant webcam permissions when prompted
- Select AI Model - Choose between VGG16 or ResNet50
- Capture Gesture - Click "Capture & Translate" or upload an image
- View Translation - See the predicted ASL sign with confidence score
- Listen - Click the speaker icon to hear the translation
- Build Sentences - Click "Add to Sentence" to accumulate multiple signs
- Speak Sentence - Convert the entire sentence to speech
- Windows: Run `stop_all.bat` or close the server windows
- Linux/Mac: Run `./stop_all.sh` or press `Ctrl+C`
The application recognizes 40 different ASL signs:
Letters (A-Z):
A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z

Numbers (0-9):
0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Words:

- HELLO
- THANK YOU
- I LOVE YOU
- PLEASE
Health check endpoint

Response:

```json
{
  "status": "success",
  "message": "Sign Language Translator API is running",
  "version": "1.0",
  "models_available": ["vgg16", "resnet50"],
  "total_classes": 40
}
```
Predict ASL gesture from image

Form Data:

- `image`: Image file (JPEG/PNG)
- `model`: Model name ("vgg16" or "resnet50")
Response:

```json
{
  "prediction": "A",
  "confidence": 95.67,
  "model_used": "vgg16",
  "class_index": 0
}
```
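A minimal Python client for this endpoint (the `/predict` path is an assumption; check the route defined in `app.py`):

```python
import requests

# Hypothetical request against the local backend; the /predict path
# and field names mirror the Form Data documented above.
with open('sample_gesture.jpg', 'rb') as f:
    resp = requests.post(
        'http://localhost:5000/predict',
        files={'image': f},
        data={'model': 'vgg16'},
    )
print(resp.json())  # e.g. {"prediction": "A", "confidence": 95.67, ...}
```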
Get available models

Response:

```json
{
  "available_models": ["vgg16", "resnet50"],
  "default_model": "vgg16"
}
```
Get all class labels

Response:

```json
{
  "labels": ["A", "B", "C", ...],
  "total_classes": 40
}
```

- Gradient Backgrounds - Eye-catching purple-to-blue gradients
- Neon Glow Effects - Modern glow effects on interactive elements
- Smooth Animations - Framer Motion for fluid transitions
- Responsive Layout - Mobile-first design approach
- Dark Theme - Easy on the eyes with vibrant accents
- Professional Icons - Lucide React icon library
- Create a `Dockerfile` in the backend directory:

```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "app:app"]
```

(Note: this image runs the app with `gunicorn`, so make sure it is listed in `requirements.txt`.)

- Deploy to your preferred platform with environment variables:
  - `PORT=5000`
  - `FLASK_ENV=production`
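For the `PORT` variable to take effect when the app is run directly (rather than via the gunicorn command above, which binds port 5000 explicitly), `app.py` would need to read it; a minimal sketch, assuming a standard Flask entry point:

```python
import os
from flask import Flask

app = Flask(__name__)

if __name__ == '__main__':
    # Respect a platform-provided PORT, defaulting to 5000 locally
    port = int(os.environ.get('PORT', 5000))
    app.run(host='0.0.0.0', port=port)
```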
- Build the production version:

```bash
npm run build
```

- Update the API endpoint in `App.jsx` to your deployed backend URL
- Deploy the `dist` folder to Netlify/Vercel
If you want to train your own models:
- Collect an ASL gesture dataset (64x64 images)
- Organize it into 40 class folders
- Use TensorFlow/Keras to train VGG16 or ResNet50
- Save the models as `.h5` files
- Place them in the `backend/` directory
Example training code structure:

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten

# Load VGG16 pre-trained on ImageNet, without its classification head
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

# Optionally freeze the convolutional base so only the new head is trained:
# for layer in base_model.layers:
#     layer.trainable = False

# Attach a new classification head for the 40 ASL classes
x = Flatten()(base_model.output)
x = Dense(256, activation='relu')(x)
predictions = Dense(40, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train your model...
model.save('model_vgg16.h5')
```

Error: "No module named 'tensorflow'"
```bash
pip install tensorflow==2.15.0
```

Error: "Cannot load model"
- Ensure the model files are in the `backend/` directory
- Check that the file names match exactly: `model_vgg16.h5` and `model_resnet.h5`
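To check a model file outside the app, you can try loading it directly (path relative to the repo root):

```python
from tensorflow.keras.models import load_model

# A correctly trained model should report input (None, 64, 64, 3)
# and output (None, 40) for the 40 ASL classes.
model = load_model('backend/model_vgg16.h5')
print(model.input_shape, model.output_shape)
```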
CORS Error
- Ensure `flask-cors` is installed
- Verify CORS is enabled in `app.py`
Error: "Failed to fetch"
- Ensure backend server is running on port 5000
- Check browser console for detailed error messages
Webcam Not Working
- Grant camera permissions in browser settings
- Use HTTPS or localhost (required for webcam access)
- Check if another application is using the camera
Build Errors
```bash
# Clear cache and reinstall
rm -rf node_modules package-lock.json
npm install
```

- Average Prediction Time: < 500ms
- Model Accuracy: 90-95% (depends on training data quality)
- Supported Image Formats: JPEG, PNG
- Max Image Size: 10MB
- Browser Compatibility: Chrome, Firefox, Safari, Edge (latest versions)
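The format and size limits would typically be enforced server-side; a hedged sketch of what such a check could look like (not necessarily how `app.py` implements it):

```python
import os

ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png'}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10 MB

def is_acceptable_upload(filename: str, data: bytes) -> bool:
    """Reject uploads that are too large or not JPEG/PNG."""
    ext = os.path.splitext(filename.lower())[1]
    return ext in ALLOWED_EXTENSIONS and len(data) <= MAX_IMAGE_BYTES
```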
- No images are stored on the server
- All processing happens in real-time
- Webcam access requires user permission
- No personal data is collected
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is open-source and available under the MIT License.
Aswin S
- 🎓 B.Sc. Computer Science (AI & Data Science)
- 🏫 Sree Narayan Guru College
- 🔬 Focus Areas: AI, Deep Learning, Computer Vision, Web AI Systems
- ASL dataset contributors
- TensorFlow & Keras teams
- React & Vite communities
- Open-source community
For questions, feedback, or collaboration opportunities, please reach out:
- Email: [Your Email]
- LinkedIn: [Your LinkedIn]
- GitHub: [Your GitHub]
Made with ❤️ for accessibility and inclusion
Empowering communication through AI