This is the official implementation of the paper:
Wang, Jian, et al. "Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement." CVPR 2024.
[Project Page] <-- Everything will be updated here.
[EgoWholeBody Training Dataset]
We base our code on the 0.x version of MMPose.
- First create the conda environment and activate it:
conda create -n egowholemocap python=3.9 -y
conda activate egowholemocap
- Then install the PyTorch build (tested on PyTorch 1.13.x) that matches your CUDA version. For example, if you have CUDA 11.7, you can install PyTorch 1.13.1 with the following command:
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
- Install this project:
pip install openmim
mim install mmcv-full==1.7.1
pip install -e .
- Install the dependencies of this project:
pip3 install -r requirements.txt
- If installing smplx pulled in open3d-python, uninstall it by running:
pip uninstall open3d-python
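A quick sanity check that the core packages import correctly (a minimal sketch; the printed versions should match what you installed above):

```python
# Verify that PyTorch and mmcv-full are importable and report their versions.
import torch
import mmcv

print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('mmcv-full', mmcv.__version__)
```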
- Change the torchgeometry code following this issue; the sketch below shows the commonly referenced fix.
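The change usually referenced for this problem replaces boolean subtraction (which recent PyTorch versions reject) with logical negation in `torchgeometry/core/conversions.py`, inside `rotation_matrix_to_quaternion`; a sketch of that patch (exact line numbers vary by version):

```python
# Before (fails on recent PyTorch, which forbids `1 - bool_tensor`):
mask_c0 = mask_d2 * mask_d0_d1
mask_c1 = mask_d2 * (1 - mask_d0_d1)
mask_c2 = (1 - mask_d2) * mask_d0_nd1
mask_c3 = (1 - mask_d2) * (1 - mask_d0_nd1)

# After (logical negation of the boolean masks):
mask_c0 = mask_d2 * mask_d0_d1
mask_c1 = mask_d2 * ~mask_d0_d1
mask_c2 = ~mask_d2 * mask_d0_nd1
mask_c3 = ~mask_d2 * ~mask_d0_nd1
```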
- Finally, download the MANO hand model, then put it under `./human_models/`. The structure of `./human_models/` should be like this:
human_models
|-- mano
|-- |-- MANO_RIGHT.pkl
|-- |-- MANO_LEFT.pkl
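To confirm the files are in place, a minimal loading sketch (assuming the smplx package installed above; `use_pca=False` is just an illustrative choice):

```python
# Load both MANO hands via smplx to verify the directory layout above.
import smplx

for is_rhand in (True, False):
    mano = smplx.create('./human_models', model_type='mano',
                        is_rhand=is_rhand, use_pca=False)
    print('loaded', 'right' if is_rhand else 'left', 'hand model:',
          type(mano).__name__)
```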
- Download the pretrained human body pose estimation model (FisheyeViT + pixel-aligned 3D heatmap) from NextCloud and put it under `./checkpoints/`.
- Download the pretrained hand detection model from NextCloud and put it under `./checkpoints/`.
- Download the pretrained hand pose estimation model from NextCloud and put it under `./checkpoints/`.
- Download the pretrained whole-body motion diffusion model from NextCloud and put it under `./checkpoints/`.
The input data should be an image sequence in the directory `./demo/resources/imgs/`.
For example, you can download the example sequence from NextCloud, unzip the file, and put it under `./demo/resources/`.
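If you want to run the demo on your own recording, the expected input is individual frames rather than a video file; a minimal extraction sketch (assuming OpenCV is installed; the input filename and the zero-padded naming pattern are illustrative):

```python
# Split a video into a frame sequence under ./demo/resources/imgs/.
import os
import cv2

os.makedirs('demo/resources/imgs', exist_ok=True)
cap = cv2.VideoCapture('my_egocentric_video.mp4')  # hypothetical input file
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(f'demo/resources/imgs/{idx:06d}.jpg', frame)
    idx += 1
cap.release()
print(f'wrote {idx} frames')
```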
tools/python_test.sh configs/egofullbody/egowholebody_single_demo.py none
The result data will be saved in `./work_dirs/egowholebody_single_demo`.
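To poke at the saved predictions, a minimal inspection sketch (the exact key names inside the file depend on the model, so this only prints the top-level structure):

```python
# Inspect the structure of the demo output pickle.
import pickle

with open('work_dirs/egowholebody_single_demo/outputs.pkl', 'rb') as f:
    outputs = pickle.load(f)

print(type(outputs))
if isinstance(outputs, dict):
    print(list(outputs.keys()))
elif isinstance(outputs, (list, tuple)) and outputs:
    print(len(outputs), 'items, first item type:', type(outputs[0]))
```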
Note: running the visualization on a headless server is not supported.
python scripts/visualization_script/vis_single_frame_whole_body_result.py \
--pred_path work_dirs/egowholebody_single_demo/outputs.pkl \
--image_id 0
python demo/demo_whole_body_diffusion.py \
--pred_path work_dirs/egowholebody_single_demo/outputs.pkl
The result will be saved in `./work_dirs/egowholebody_diffusion_demo`.
Note: running the visualization on a headless server is not supported.
python scripts/visualization_script/vis_diffusion_whole_body_result.py \
--pred_path work_dirs/egowholebody_diffusion_demo/outputs.pkl \
--image_id 0
- Download the EgoWholeBody synthetic dataset from NextCloud.
- Unzip all of the files. The file structure should be like this:
path_to_dataset_dir
|-- renderpeople_adanna
|-- renderpeople_amit
|-- ......
|-- renderpeople_mixamo_labels_old.pkl
|-- ......
- Download the pre-trained ViT model from here and put it under `./pretrained_models/`.
- Modify the config file `configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d.py`: update the paths on lines 1, 19, 28, 29, 149, and 150.
- Modify the paths on lines 22-35 of `mmpose/datasets/datasets/egocentric/mocap_studio_dataset.py` to point to the SceneEgo test dataset.
- Train the model:
tools/python_train.sh configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d.py
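Since several paths have to be edited by hand, it can save a failed run to confirm they resolve before launching training; a small sketch (the paths listed are placeholders for the ones you entered in the config and dataset files):

```python
# Check that every hand-edited path actually exists before training.
import os

paths_to_check = [
    '/path/to/egowholebody_dataset',   # placeholder
    '/path/to/sceneego/test',          # placeholder
    './pretrained_models',
]
for p in paths_to_check:
    print('OK  ' if os.path.exists(p) else 'MISSING', p)
```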
- Modify the paths in the config file `configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d_finetune_size_0.2_better_init.py`.
- Modify the paths in `mmpose/datasets/datasets/egocentric/mocap_studio_finetune_dataset.py` to point to the SceneEgo training dataset.
- Finetune the model:
tools/python_train.sh configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d_finetune_size_0.2_better_init.py
- Download the hand4whole model from here (in this GitHub repo).
- To finetune the model on SceneEgo:
  - Modify the paths in the config file `configs/egofullbody/egohand/hands4whole_train.py`.
  - Train:
tools/python_train.sh configs/egofullbody/egohand/hands4whole_train.py
- To finetune the model on EgoWholeBody:
  - Modify the paths in the config file `configs/egofullbody/egohand/hands4whole_train_synthetic.py`.
  - Train:
tools/python_train.sh configs/egofullbody/egohand/hands4whole_train_synthetic.py
- For testing on the SceneEgo dataset, see `configs/egofullbody/egohand/hands4whole_test_finetuned.py`.
- For testing on the synthetic EgoWholeBody dataset, see `configs/egofullbody/egofullbody_test_synthetic_fisheye.py`.
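By analogy with the demo command above, evaluation presumably goes through the same test entry point; a sketch, assuming `tools/python_test.sh` takes the config followed by a checkpoint argument (the demo invocation passed `none` there, and the checkpoint path below is a placeholder):

tools/python_test.sh configs/egofullbody/egohand/hands4whole_test_finetuned.py ./checkpoints/your_finetuned_checkpoint.pth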
Since we use MMPose as the training and evaluation framework, please see get_started.md for the basic usage of MMPose. There are also tutorials:
- learn about configs
- finetune model
- add new dataset
- customize data pipelines
- add new modules
- export a model to ONNX
- customize runtime settings
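For orientation, MMPose 0.x configs are plain Python files that can inherit from a base config through the `_base_` mechanism; a minimal illustrative sketch (the file name and overridden variables here are hypothetical, check the actual config for the real names):

```python
# my_experiment.py -- hypothetical config inheriting from a base config.
_base_ = ['./undistort_vit_heatmap_3d.py']

# Any top-level variable re-defined here overrides the value from the base.
data_root = '/path/to/egowholebody_dataset'  # placeholder path
total_epochs = 10
```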
If you find this project useful in your research, please consider citing:
@inproceedings{wang2024egocentric,
title={Egocentric whole-body motion capture with {FisheyeViT} and diffusion-based motion refinement},
author={Wang, Jian and Cao, Zhe and Luvizon, Diogo and Liu, Lingjie and Sarkar, Kripasindhu and Tang, Danhang and Beeler, Thabo and Theobalt, Christian},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={777--787},
year={2024}
}
This project is released under the Apache 2.0 license.