AVmod is an audiovisual modulator developed as a third-year project.
- AVmod uses an LSGAN objective and a CycleGAN-style network to perform face swapping and voice modulation.
- Himanshu
- Jai
- Karan
- Sagar
-
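As background, the LSGAN objective mentioned above replaces the usual cross-entropy GAN loss with least-squares targets. A minimal numpy sketch, assuming the common target choice of 1 for real and 0 for fake (the actual notebooks implement this in Keras):

```python
import numpy as np

def lsgan_d_loss(real_scores, fake_scores):
    # Discriminator: push real scores toward 1, fake scores toward 0
    return 0.5 * (np.mean((real_scores - 1.0) ** 2) + np.mean(fake_scores ** 2))

def lsgan_g_loss(fake_scores):
    # Generator: make the discriminator score fakes as 1
    return 0.5 * np.mean((fake_scores - 1.0) ** 2)
```

The squared-error targets give smoother gradients than the saturating log loss, which is why LSGAN is a popular drop-in for image-to-image GANs.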
- File-1
- Responsible for frame extraction and face detection/alignment on the input video.
- Detected faces are saved in ./faces/raw_face (non-aligned) and ./faces/aligned_faces (aligned).
- Crude binary eye masks are saved in ./faces/binary_mask_eye.
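The alignment idea can be sketched with plain numpy, assuming the standard 68-point landmark convention used by face_alignment (indices 36-41 and 42-47 are the eyes); the function names here are illustrative, and the notebook additionally crops and warps the frame:

```python
import numpy as np

def eye_centers(landmarks):
    """Mean position of each eye's landmarks (68-point convention)."""
    left = landmarks[36:42].mean(axis=0)
    right = landmarks[42:48].mean(axis=0)
    return left, right

def alignment_angle_deg(landmarks):
    """Rotation (in degrees) that would make the eye line horizontal."""
    left, right = eye_centers(landmarks)
    return np.degrees(np.arctan2(right[1] - left[1], right[0] - left[0]))
```

The resulting angle can be fed to cv2.getRotationMatrix2D / cv2.warpAffine to produce the aligned crops saved in ./faces/aligned_faces.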
-
- File-2
- For data preprocessing.
- Creates binary masks from the images in aligned_faces and saves the results in the ./binary_masks/faceA_eyes and ./binary_masks/faceB_eyes folders.
- Requires the face_alignment package.
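A crude version of the mask step can be sketched with plain numpy: fill a padded bounding box around each eye's landmarks. This is an assumption-laden sketch — the notebook draws proper landmark polygons (typically with cv2.fillConvexPoly); box filling just keeps the example dependency-free:

```python
import numpy as np

def crude_eye_mask(landmarks, height, width, pad=4):
    """Binary mask with a padded bounding box over each eye (68-pt indices)."""
    mask = np.zeros((height, width), dtype=np.uint8)
    for eye in (slice(36, 42), slice(42, 48)):  # left eye, right eye
        pts = landmarks[eye]
        x0, y0 = np.floor(pts.min(axis=0)).astype(int) - pad
        x1, y1 = np.ceil(pts.max(axis=0)).astype(int) + pad + 1
        mask[max(y0, 0):y1, max(x0, 0):x1] = 255
    return mask
```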
-
- File-3
- Used for model training.
- Requires the additional training images generated through prep_binary_masks.ipynb.
- Saves models in the ./model directory.
- Saves backup models in ./model/backup_iter{iteration_num}.
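The backup layout can be sketched as a small helper; the interval of 10,000 iterations is an assumption for illustration, not the notebook's actual setting:

```python
import os

MODEL_DIR = "./model"    # main save location, as described above
BACKUP_EVERY = 10000     # assumed backup interval (illustrative)

def backup_dir(iteration, base=MODEL_DIR):
    """Path of the periodic backup folder, e.g. ./model/backup_iter10000."""
    return os.path.join(base, f"backup_iter{iteration}")

def backup_iterations(total_iters, every=BACKUP_EVERY):
    """Iterations at which a backup would be written."""
    return list(range(every, total_iters + 1, every))
```

At each of these iterations the training loop would write the current generator/discriminator weights into the corresponding backup folder.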
-
- File-4
- Used for video conversion based on the training done in train.ipynb.
- Uses five-point landmarks for face alignment.
- Picks the images stored in ./facesA/aligned_faces and ./facesB/aligned_faces for each target.
- Images are resized to 256x256 for training.
- Training runs for 40,000 iterations by default and can be increased to 80,000 or more as required.
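The 256x256 resize is normally a one-liner with cv2.resize; a dependency-free nearest-neighbour sketch of the same operation:

```python
import numpy as np

TARGET_SIZE = 256  # training resolution described above

def resize_nearest(img, size=TARGET_SIZE):
    """Nearest-neighbour resize of an HxW(xC) image to size x size."""
    h, w = img.shape[:2]
    ys = np.arange(size) * h // size  # source row for each output row
    xs = np.arange(size) * w // size  # source column for each output column
    return img[ys][:, xs]
```

In practice bilinear or area interpolation (as cv2.resize provides) gives smoother training inputs than nearest-neighbour.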
- python 3.6.4
- tensorflow r1.15.2
- keras r2.1.5
- opencv
- keras_vggface
- moviepy
- face_alignment
- pathlib
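The list above corresponds to a requirements.txt along these lines (a sketch: the pins mirror the versions listed, and opencv is assumed to come from the opencv-python PyPI package):

```text
tensorflow==1.15.2
keras==2.1.5
opencv-python
keras_vggface
moviepy
face_alignment
pathlib
```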
- Create a new virtual environment with Python v3.6.4.
- Run the command:
pip install -r requirements.txt
- Functionality for voice modulation
- Increase face swapping area
- Binary Mask for mouth
- Interface for easy access to training and conversion
- GPU (CUDA) support for the Python scripts
Code borrowed from tjwei and keras-contrib. The generative network is adapted from CycleGAN. MTCNN weights and scripts are from FaceNet.