This project is a deep learning model that reads lips: given short, silent video clips of someone speaking, it predicts what they are saying purely by analyzing their lip movements.

What it does:
Loads short video clips and breaks them down into individual frames
Crops each frame to the mouth region and converts the video to grayscale
Aligns each clip with its text transcript (what the person was saying)
Trains a neural network to recognize patterns in lip movement
Predicts the spoken words without using any audio
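The preprocessing steps above (frame extraction, mouth cropping, grayscale conversion) can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the fixed `mouth_box` coordinates and the `preprocess_clip` name are assumptions, and a real pipeline would typically locate the mouth with a face-landmark detector rather than a hard-coded crop.

```python
import numpy as np

def preprocess_clip(frames, mouth_box=(190, 236, 80, 220)):
    """Crop each RGB frame to the mouth region and convert to grayscale.

    frames: uint8 array of shape (T, H, W, 3), T frames of H x W RGB pixels.
    mouth_box: (top, bottom, left, right) crop window -- a fixed-position
    assumption for illustration; real systems detect the mouth per frame.
    """
    top, bottom, left, right = mouth_box
    cropped = frames[:, top:bottom, left:right, :].astype(np.float32)
    # Luminance-weighted grayscale conversion (ITU-R BT.601 weights).
    gray = cropped @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Normalize to zero mean / unit variance per clip, a common step
    # before feeding frames to a neural network.
    gray = (gray - gray.mean()) / (gray.std() + 1e-8)
    return gray  # shape (T, bottom - top, right - left)

# Usage with a dummy clip standing in for a real decoded video.
clip = np.random.randint(0, 256, size=(75, 288, 360, 3), dtype=np.uint8)
mouth = preprocess_clip(clip)
print(mouth.shape)  # (75, 46, 140): 75 grayscale mouth crops
```

The resulting `(T, height, width)` tensor of normalized mouth crops is what a lip-reading network would consume, typically paired with the transcript encoded as a sequence of character indices.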