MatAnyone is a practical human video matting framework supporting target assignment, with stable performance in both semantics of core regions and fine-grained boundary details.
🎥 For more visual results, go check out our project page.
- [2025.02] Release inference codes and gradio demo 🤗
- [2025.02] This repo is created.
- Clone Repo

git clone https://github.com/pq-yang/MatAnyone
cd MatAnyone

- Create Conda Environment and Install Dependencies

# create new conda env
conda create -n matanyone python=3.8 -y
conda activate matanyone

# install python dependencies
pip install -e .

# [optional] install python dependencies for gradio demo
pip3 install -r hugging_face/requirements.txt
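After installation, you can optionally verify that PyTorch sees your GPU from the new environment (a quick sanity check, not part of the official setup):

# [optional] check the PyTorch version and CUDA availability
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"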
Download our pretrained model from MatAnyone v1.0.0 to the pretrained_models folder (the pretrained model can also be downloaded automatically during the first inference).
The directory structure will be arranged as:
pretrained_models
|- matanyone.pth
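If you prefer fetching the checkpoint from the command line, a sketch like the following should work, assuming the file is published as a release asset named matanyone.pth under the v1.0.0 tag (the exact URL is an assumption; adjust it to match the actual release page):

# download the checkpoint into pretrained_models (URL assumed from the v1.0.0 release layout)
mkdir -p pretrained_models
wget -P pretrained_models https://github.com/pq-yang/MatAnyone/releases/download/v1.0.0/matanyone.pth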
We provide some examples in the inputs folder. For each run, we take a video and its first-frame segmentation mask as input. The segmentation mask can be obtained from interactive segmentation models such as the SAM2 demo. For example, the directory structure can be arranged as:
inputs
|- video
|- test-sample0 # folder containing all frames
|- test-sample1.mp4 # .mp4, .mov, .avi
|- mask
|- test-sample0_1.png # mask for person 1
|- test-sample0_2.png # mask for person 2
|- test-sample1.png
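If your source is a single video file but you want the frame-folder layout (as for test-sample0), a standard FFmpeg command can split it into frames first; the folder name and numbering pattern below are only an example:

# split a video into numbered frames (example paths, not shipped with the repo)
mkdir -p inputs/video/my-sample
ffmpeg -i my-video.mp4 -qscale:v 2 inputs/video/my-sample/%05d.jpg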
Run the following commands to try it out:
## single target
# short video; 720p
python inference_matanyone.py -i inputs/video/test-sample1.mp4 -m inputs/mask/test-sample1.png
# short video; 1080p
python inference_matanyone.py -i inputs/video/test-sample2.mp4 -m inputs/mask/test-sample2.png
# long video; 1080p
python inference_matanyone.py -i inputs/video/test-sample3.mp4 -m inputs/mask/test-sample3.png
## multiple targets (control by mask)
# obtain matte for target 1
python inference_matanyone.py -i inputs/video/test-sample0 -m inputs/mask/test-sample0_1.png --suffix target1
# obtain matte for target 2
python inference_matanyone.py -i inputs/video/test-sample0 -m inputs/mask/test-sample0_2.png --suffix target2
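If you want mattes for every target of the same clip in one go, a plain shell loop over the mask files works; the loop below simply mirrors the test-sample0 naming used above:

# run inference once per mask, suffixing outputs by the target index
for m in inputs/mask/test-sample0_*.png; do
  id=$(basename "$m" .png | sed 's/.*_//')   # e.g. test-sample0_2.png -> 2
  python inference_matanyone.py -i inputs/video/test-sample0 -m "$m" --suffix "target${id}"
done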
The results will be saved in the results folder, including the foreground output video and the alpha output video. If you also want to save the results as per-frame images, you can set --save_image.
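For example, to keep the per-frame images for the first test clip as well:

# also save per-frame results as images (in addition to the output videos)
python inference_matanyone.py -i inputs/video/test-sample1.mp4 -m inputs/mask/test-sample1.png --save_image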
To skip preparing the first-frame segmentation mask yourself, we provide a Gradio demo on Hugging Face, which can also be launched locally. Just drop your video/image, assign the target masks with a few clicks, and get the matting results!
cd hugging_face
# install python dependencies
pip3 install -r requirements.txt # FFmpeg required
# launch the demo
python app.py
After launching, an interactive interface will appear as follows:
If you find our repo useful for your research, please consider citing our paper:
@InProceedings{yang2025matanyone,
title = {{MatAnyone}: Stable Video Matting with Consistent Memory Propagation},
author = {Yang, Peiqing and Zhou, Shangchen and Zhao, Jixin and Tao, Qingyi and Loy, Chen Change},
booktitle = {arXiv preprint arXiv:2501.14677},
year = {2025}
}
This project is licensed under NTU S-Lab License 1.0. Redistribution and use should follow this license.
This project is built upon Cutie, with the interactive demo adapted from ProPainter, leveraging segmentation capabilities from Segment Anything Model and Segment Anything Model 2. Thanks for their awesome works!
If you have any questions, please feel free to reach us at peiqingyang99@outlook.com.