This project aims to develop and test different lip reading algorithms on words and on sentences, using the GRID Corpus Dataset.

Baiame/lip_reading_project_public


Automated Lip Reading

Description of the project

This project aims to develop and test different lip reading algorithms on words and on sentences, using the GRID Corpus Dataset.


Requirements

  • pytorch (CUDA version)
  • tensorboard
  • opencv
  • numpy
  • editdistance
  • scikit-learn (imported as sklearn)
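
A quick way to confirm that the packages above are installed is to look up their import names. The sketch below is only a convenience check and is not part of the repository; the mapping from package names to import names (pytorch to torch, opencv to cv2) is an assumption, and it does not verify that the CUDA build of PyTorch is the one installed.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above:
# pytorch -> torch, opencv -> cv2, scikit-learn -> sklearn.
REQUIRED_MODULES = ["torch", "tensorboard", "cv2", "numpy", "editdistance", "sklearn"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed requirements are importable.")
```

An empty result only means the modules can be located; GPU support would still need a separate check on a machine with CUDA.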

Sentence Lip Reading

Description of the files

Python File                        Usage
array_text_conversion.py           Functions that convert a CTC array to text
create_the_index_txt_file.py       Creates the text files that list the video paths
dataset_transformers.py            Dataset implementation
evaluate_model.py                  Runs evaluation on the given model and creates confusion matrices
metrics.py                         Metric functions
model_dense_transformer_bigru.py   Implementation of the sentence lip reading model
dense_modules.py                   Dense3D modules used to build the front-end network
options.py                         Parameters file
predict.py                         Runs the demo
predict_live.py                    Runs live prediction using the webcam
train_model.py                     Training
video_processing.py                Data processing functions that create the pre-processed GRID dataset
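
As an illustration of what converting a CTC array to text usually involves, here is a minimal sketch of greedy CTC decoding: repeated indices are collapsed, blanks are dropped, and the remaining indices are mapped to characters. It is not the code from array_text_conversion.py; the function names, the blank index and the vocabulary (space plus a to z) are all assumptions made for the example.

```python
# Hypothetical vocabulary: index 0 is the CTC blank, 1 is space, 2..27 are a..z.
BLANK = 0
ALPHABET = [" "] + [chr(c) for c in range(ord("a"), ord("z") + 1)]


def ctc_greedy_decode(indices, blank=BLANK, alphabet=ALPHABET):
    """Collapse consecutive repeats, drop blanks, and map indices to characters."""
    chars = []
    previous = None
    for idx in indices:
        # Emit a character only when the index changes and is not the blank.
        if idx != previous and idx != blank:
            chars.append(alphabet[idx - 1])
        previous = idx
    return "".join(chars)


def text_to_indices(text, alphabet=ALPHABET):
    """Inverse mapping for label text (no blanks inserted)."""
    return [alphabet.index(ch) + 1 for ch in text]
```

In this scheme a blank between two identical indices is what allows a doubled letter to survive decoding, while two identical indices with no blank between them collapse into a single character.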

How to run

Demo

  • Download the weights
  • Add the shape_predictor_68_face_landmarks.dat file to the sentence_lipreading folder if it is missing.
  • Modify the weights path and the model type in options.py if needed.
  • Run predict.py for the demo, or predict_live.py for live prediction using the webcam.

Training

  • To train a new model, comment out the weights line in options.py; otherwise training continues from the existing model at the weights path.
  • Download and extract fraction_processed_dataset_slr from the link given above. It contains the lip frames, the alignment files and the txt files that list the paths of the videos for training and validation.
  • In options.py, set video_path (the lips folder), alignment_path (the alignment folder), training_list (video_paths_list_training.txt) and validation_list (video_paths_list_validation.txt) so that they point to the extracted dataset. Also set save_model_path to the location where the model should be saved.
  • Run train_model.py
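
The training settings above can be sketched as follows. The option names are the ones mentioned in the steps; everything else is an assumption: the real options.py may not use a dictionary, all path values and the weights file name are placeholders, and missing_paths is a hypothetical helper for catching a wrong path before a long training run starts.

```python
from pathlib import Path

# Option names come from the steps above; every value is a placeholder.
options = {
    "video_path": "/path/to/fraction_processed_dataset_slr/lips",
    "alignment_path": "/path/to/fraction_processed_dataset_slr/alignment",
    "training_list": "/path/to/video_paths_list_training.txt",
    "validation_list": "/path/to/video_paths_list_validation.txt",
    "save_model_path": "/path/to/saved_models",
    # "weights": "/path/to/existing_weights",  # commented out to train a new model
}

DATASET_KEYS = ("video_path", "alignment_path", "training_list", "validation_list")


def missing_paths(opts, keys=DATASET_KEYS):
    """Return the option names whose path does not exist on disk."""
    return [key for key in keys if not Path(opts[key]).exists()]


if __name__ == "__main__":
    problems = missing_paths(options)
    if problems:
        print("Check these options before training:", ", ".join(problems))
```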

Word Lip Reading

Python File                           Usage
dataset_mouth_shape.py                Dataset object for the mouth shapes
dataset_one_word.py                   Dataset object for the one-word videos
model_mouth_shape.py                  Implementation of the mouth shapes model
model_one_word.py                     Implementation of the one-word videos model
test.py                               Runs the test of the model
train_one_word.py                     Trains a model on the chosen dataset
video_processing.py                   Video processing tools
compute_images_from_video.py          Computes the one-word videos dataset from the complete GRID Corpus dataset
compute_images_from_videos_light.py   Computes a light version of the one-word videos dataset from the complete GRID Corpus dataset

How to run

Testing

  • Set the paths of the weights file and of the test files in test.py.
  • Add the shape_predictor_68_face_landmarks.dat file to the word_lipreading folder if it is missing.
  • Run test.py

Training

  • Compute the dataset from the GRID Corpus dataset (link), using compute_images_from_video.py or compute_images_from_videos_light.py.
  • Run train_one_word.py
