A bidirectional translator for Arabic Sign Language that converts speech → sign and sign → speech in real time using your webcam.
It recognizes letters, numbers, and words/phrases, with live Arabic text output.
- Type a word or sentence manually or use your voice 🎤.
- Displays animated GIFs for words in Arabic.
- Supported words include: "mom", "Alhamdulillah", "angry", "house", "how are you", and more.
- GIFs are fetched directly from online URLs 🌐.
- Real-time hand sign recognition via webcam 🎥.
- Supports three model types:
  - Numbers (0–10)
  - Arabic letters
  - Words/Phrases
- Uses MediaPipe Hands for hand tracking 🖐.
- Recognizes repeated signs and speaks them aloud in Arabic 🔊.
- Displays shaped Arabic text on screen using arabic_reshaper + python-bidi.
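
One way to "recognize repeated signs" before speaking is to accept a label only after the classifier has produced it on several consecutive frames. The sketch below illustrates that idea and is not the project's actual code: the class name `StableSignFilter` and the `required_frames` threshold are hypothetical, and it assumes the per-frame prediction is a string label, or `None` when no hand is visible.

```python
from collections import deque
from typing import Deque, Optional


class StableSignFilter:
    """Emit a label only after it was predicted on N consecutive frames."""

    def __init__(self, required_frames: int = 15) -> None:
        # Hypothetical threshold: how many identical frames count as "repeated".
        self.required_frames = required_frames
        self._recent: Deque[str] = deque(maxlen=required_frames)
        self._last_emitted: Optional[str] = None

    def update(self, label: Optional[str]) -> Optional[str]:
        """Feed one per-frame prediction; return a label when it becomes stable."""
        if label is None:
            # No hand in view: reset so the same sign can be emitted again later.
            self._recent.clear()
            self._last_emitted = None
            return None

        self._recent.append(label)
        window_full = len(self._recent) == self.required_frames
        all_same = len(set(self._recent)) == 1

        # Emit once per stable run, not on every frame of that run.
        if window_full and all_same and label != self._last_emitted:
            self._last_emitted = label
            return label
        return None
```

In a webcam loop, `update()` would be called once per frame with the model's prediction, and any non-`None` return value would be appended to the on-screen text and passed to the speech step.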
- Collected ~500 images per class for letters, numbers, and words.
- Landmarks extracted and saved in pickle files for training.
- Trained Random Forest classifiers for each category.
- Real-time predictions use these pre-trained models.
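
The landmark-to-pickle step above could look roughly like the following sketch. MediaPipe Hands reports 21 landmarks per hand; here each landmark is assumed to have already been reduced to an `(x, y)` pair, giving 42 features per sample. The function names, the offset normalization, and the `{"data", "labels"}` layout are illustrative assumptions, not the project's confirmed format.

```python
import pickle
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def landmarks_to_features(points: Sequence[Point]) -> List[float]:
    """Flatten (x, y) landmarks, shifted so the smallest x and y become 0.

    Subtracting the minimum makes the features independent of where the
    hand sits in the frame (an assumed normalization, not a confirmed one).
    """
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    features: List[float] = []
    for x, y in points:
        features.extend([x - min_x, y - min_y])
    return features


def save_dataset(samples: Sequence[List[float]], labels: Sequence[str], path: str) -> None:
    """Store feature vectors and their class labels in one pickle file."""
    with open(path, "wb") as f:
        pickle.dump({"data": list(samples), "labels": list(labels)}, f)


def load_dataset(path: str) -> Dict[str, list]:
    """Load a dataset written by save_dataset."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The loaded `data` and `labels` lists are in the shape a scikit-learn classifier such as a Random Forest expects for `fit(data, labels)`, with one dataset file per category (numbers, letters, words).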
- Start/stop the camera 🎥
- Enable/disable speech 🔊
- Toggle between right/left hand ✋
- Clear recognized text 🧹
Python 3.8+ and the following libraries:

pip install customtkinter pillow requests SpeechRecognition opencv-python mediapipe numpy arabic-reshaper python-bidi gTTS pygame scikit-learn

See how typed or spoken text is converted to animated sign GIFs instantly.
Real-time hand sign recognition shows the corresponding Arabic text and speaks it aloud.
- Go to the "Text → Sign" tab.
- Type a word and click Show GIF, or click Click to Record to use speech input.
- The corresponding GIF will appear.
- Go to the "Sign → Speech" tab.
- Select Numbers, Letters, or Words.
- The webcam will detect your hand signs.
- Recognized signs are:
  - Spoken aloud in Arabic 🔊
  - Displayed as shaped Arabic text 🖋
- Use the Clear Text button 🧹 to reset.
- Toggle the tracked hand using the Use Left/Right Hand button ✋
- Audio is cached in audio_cache/ for faster playback 🎧
- Left-hand recognition mirrors coordinates for accuracy ↔️
- Minimum interval prevents repeated audio output ⏱
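
The three behaviors just listed (audio caching, left-hand mirroring, and a minimum interval between utterances) can each be sketched in a few lines. Everything here is an assumption made for illustration: the helper names, the hashed `.mp3` file naming, the 2-second default, and the premise that landmark x coordinates are normalized to the 0..1 range.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def cached_audio_path(text: str, cache_dir: str = "audio_cache") -> Path:
    """Map a phrase to a stable file name so its audio is synthesized only once."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.mp3"


def mirror_landmarks(points: Sequence[Point]) -> List[Point]:
    """Flip normalized x coordinates so a left hand resembles a right hand."""
    return [(1.0 - x, y) for x, y in points]


class SpeechGate:
    """Allow speech only if enough time has passed since the last utterance."""

    def __init__(
        self,
        min_interval: float = 2.0,  # hypothetical default, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._last: Optional[float] = None

    def allow(self) -> bool:
        now = self._clock()
        if self._last is not None and now - self._last < self.min_interval:
            return False  # too soon: skip this utterance
        self._last = now
        return True
```

A caller would check `cached_audio_path(text).exists()` before synthesizing new audio, apply `mirror_landmarks()` only when the left-hand toggle is active, and wrap playback in `if gate.allow():`.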
- Internet is required to display GIFs 🌐
- Maintain an equal number of images per class (~500) for better accuracy
- Models can be upgraded to CNN/LSTM for higher accuracy 🤖
- Expand dataset with more words and phrases 📚
- Implement deep learning models for higher accuracy 🔥
- Add a GUI interface for interactive feedback 🖥
- Support multiple hands and gestures simultaneously ✌️

