This is the official implementation of the paper:
Wang, Jian, et al. "Egocentric Whole-Body Motion Capture with FisheyeViT and Diffusion-Based Motion Refinement." CVPR 2024.
[Project Page] <-- Everything will be updated here.
[EgoWholeBody Training Dataset]
We base our code on the 0.x version of MMPose.
- First create the conda environment and activate it:
conda create -n egowholemocap python=3.9 -y
conda activate egowholemocap
- Then install the PyTorch build (tested on PyTorch 1.13.x) that matches your CUDA version. For example, if you have CUDA 11.7, you can install PyTorch 1.13.1 with the following command:
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
- Install this project:
pip install openmim
mim install mmcv-full==1.7.1
pip install -e .
- Install the dependencies of this project:
pip3 install -r requirements.txt
- If installing smplx pulled in open3d-python, uninstall it by running:
pip uninstall open3d-python
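A quick sanity check that the core packages import correctly (a minimal sketch; the printed versions should match what you installed above):

```python
# Verify that PyTorch and mmcv-full are importable and report their versions.
import torch
import mmcv

print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())
print('mmcv-full', mmcv.__version__)
```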
- Change the torchgeometry code following this issue; the sketch below shows the commonly referenced fix.
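The change usually referenced for this problem replaces boolean subtraction (which recent PyTorch versions reject) with logical negation in `torchgeometry/core/conversions.py`, inside `rotation_matrix_to_quaternion`; a sketch of that patch (exact line numbers vary by version):

```python
# Before (fails on recent PyTorch, which forbids `1 - bool_tensor`):
mask_c0 = mask_d2 * mask_d0_d1
mask_c1 = mask_d2 * (1 - mask_d0_d1)
mask_c2 = (1 - mask_d2) * mask_d0_nd1
mask_c3 = (1 - mask_d2) * (1 - mask_d0_nd1)

# After (logical negation of the boolean masks):
mask_c0 = mask_d2 * mask_d0_d1
mask_c1 = mask_d2 * ~mask_d0_d1
mask_c2 = ~mask_d2 * mask_d0_nd1
mask_c3 = ~mask_d2 * ~mask_d0_nd1
```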
- Finally, download the MANO hand model, then put it under `./human_models/`. The structure of `./human_models/` should be like this:
human_models
|-- mano
|-- |-- MANO_RIGHT.pkl
|-- |-- MANO_LEFT.pkl
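To confirm the files are in place, a minimal loading sketch (assuming the smplx package installed above; `use_pca=False` is just an illustrative choice):

```python
# Load both MANO hands via smplx to verify the directory layout above.
import smplx

for is_rhand in (True, False):
    mano = smplx.create('./human_models', model_type='mano',
                        is_rhand=is_rhand, use_pca=False)
    print('loaded', 'right' if is_rhand else 'left', 'hand model:',
          type(mano).__name__)
```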
- Download the pretrained human body pose estimation model (FisheyeViT + pixel-aligned 3D heatmap) from NextCloud and put it under `./checkpoints/`.
- Download the pretrained hand detection model from NextCloud and put it under `./checkpoints/`.
- Download the pretrained hand pose estimation model from NextCloud and put it under `./checkpoints/`.
- Download the pretrained whole-body motion diffusion model from NextCloud and put it under `./checkpoints/`.
The input data should be an image sequence in the directory `./demo/resources/imgs/`.
For example, you can download the example sequence from NextCloud, unzip the file, and put it under `./demo/resources/`.
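If you want to run the demo on your own recording, the expected input is individual frames rather than a video file; a minimal extraction sketch (assuming OpenCV is installed; the input filename and the zero-padded naming pattern are illustrative):

```python
# Split a video into a frame sequence under ./demo/resources/imgs/.
import os
import cv2

os.makedirs('demo/resources/imgs', exist_ok=True)
cap = cv2.VideoCapture('my_egocentric_video.mp4')  # hypothetical input file
idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imwrite(f'demo/resources/imgs/{idx:06d}.jpg', frame)
    idx += 1
cap.release()
print(f'wrote {idx} frames')
```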
tools/python_test.sh configs/egofullbody/egowholebody_single_demo.py none
The result data will be saved in `./work_dirs/egowholebody_single_demo`.
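To poke at the saved predictions, a minimal inspection sketch (the exact key names inside the file depend on the model, so this only prints the top-level structure):

```python
# Inspect the structure of the demo output pickle.
import pickle

with open('work_dirs/egowholebody_single_demo/outputs.pkl', 'rb') as f:
    outputs = pickle.load(f)

print(type(outputs))
if isinstance(outputs, dict):
    print(list(outputs.keys()))
elif isinstance(outputs, (list, tuple)) and outputs:
    print(len(outputs), 'items, first item type:', type(outputs[0]))
```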
Note: running the visualization on a headless server is not supported.
python scripts/visualization_script/vis_single_frame_whole_body_result.py \
--pred_path work_dirs/egowholebody_single_demo/outputs.pkl \
--image_id 0
python demo/demo_whole_body_diffusion.py \
--pred_path work_dirs/egowholebody_single_demo/outputs.pkl
The result will be saved in `./work_dirs/egowholebody_diffusion_demo`.
Note: running the visualization on a headless server is not supported.
python scripts/visualization_script/vis_diffusion_whole_body_result.py \
--pred_path work_dirs/egowholebody_diffusion_demo/outputs.pkl \
--image_id 0
- Download the EgoWholeBody synthetic dataset from NextCloud.
- Unzip all of the files. The file structure should be like this:
path_to_dataset_dir
|-- renderpeople_adanna
|-- renderpeople_amit
|-- ......
|-- renderpeople_mixamo_labels_old.pkl
|-- ......
- Download the pre-trained ViT model from here and put it under `./pretrained_models/`.
- Modify the config file `configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d.py`: update the paths on lines 1, 19, 28, 29, 149, and 150.
- Modify the paths on lines 22-35 of `mmpose/datasets/datasets/egocentric/mocap_studio_dataset.py` to point to the SceneEgo test dataset.
- Train the model:
tools/python_train.sh configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d.py
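Since several paths have to be edited by hand, it can save a failed run to confirm they resolve before launching training; a small sketch (the paths listed are placeholders for the ones you entered in the config and dataset files):

```python
# Check that every hand-edited path actually exists before training.
import os

paths_to_check = [
    '/path/to/egowholebody_dataset',   # placeholder
    '/path/to/sceneego/test',          # placeholder
    './pretrained_models',
]
for p in paths_to_check:
    print('OK  ' if os.path.exists(p) else 'MISSING', p)
```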
- Modify the paths in the config file `configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d_finetune_size_0.2_better_init.py`.
- Modify the paths in `mmpose/datasets/datasets/egocentric/mocap_studio_finetune_dataset.py` to point to the SceneEgo training dataset.
- Finetune the model:
tools/python_train.sh configs/egofullbody/fisheye_vit/undistort_vit_heatmap_3d_finetune_size_0.2_better_init.py
- Download the hand4whole model from here (in this GitHub repo).
- To finetune the model on SceneEgo:
  - Modify the paths in the config file `configs/egofullbody/egohand/hands4whole_train.py`.
  - Train:
tools/python_train.sh configs/egofullbody/egohand/hands4whole_train.py
- To finetune the model on EgoWholeBody:
  - Modify the paths in the config file `configs/egofullbody/egohand/hands4whole_train_synthetic.py`.
  - Train:
tools/python_train.sh configs/egofullbody/egohand/hands4whole_train_synthetic.py
- For testing on the SceneEgo dataset, see `configs/egofullbody/egohand/hands4whole_test_finetuned.py`.
- For testing on the synthetic EgoWholeBody dataset, see `configs/egofullbody/egofullbody_test_synthetic_fisheye.py`.
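By analogy with the demo command above, evaluation presumably goes through the same test entry point; a sketch, assuming `tools/python_test.sh` takes the config followed by a checkpoint argument (the demo invocation passed `none` there, and the checkpoint path below is a placeholder):

tools/python_test.sh configs/egofullbody/egohand/hands4whole_test_finetuned.py ./checkpoints/your_finetuned_checkpoint.pth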
Since we use MMPose as the training and evaluation framework, please see get_started.md for the basic usage of MMPose. There are also tutorials:
- learn about configs
- finetune model
- add new dataset
- customize data pipelines
- add new modules
- export a model to ONNX
- customize runtime settings
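For orientation, MMPose 0.x configs are plain Python files that can inherit from a base config through the `_base_` mechanism; a minimal illustrative sketch (the file name and overridden variables here are hypothetical, check the actual config for the real names):

```python
# my_experiment.py -- hypothetical config inheriting from a base config.
_base_ = ['./undistort_vit_heatmap_3d.py']

# Any top-level variable re-defined here overrides the value from the base.
data_root = '/path/to/egowholebody_dataset'  # placeholder path
total_epochs = 10
```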
If you find this project useful in your research, please consider citing:
@inproceedings{wang2024egocentric,
title={Egocentric whole-body motion capture with {FisheyeViT} and diffusion-based motion refinement},
author={Wang, Jian and Cao, Zhe and Luvizon, Diogo and Liu, Lingjie and Sarkar, Kripasindhu and Tang, Danhang and Beeler, Thabo and Theobalt, Christian},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={777--787},
year={2024}
}
This project is released under the Apache 2.0 license.