Fabien Baradel*,
Matthieu Armando,
Salma Galaaoui,
Romain Brégier,
Philippe Weinzaepfel,
Grégory Rogez,
Thomas Lucas*
ECCV'24
* equal contribution
Multi-HMR is a simple yet effective single-shot model for multi-person and expressive human mesh recovery. It takes as input a single RGB image and efficiently performs 3D reconstruction of multiple humans in camera space.
- 2024/07/03: Release of training-evaluation code.
- 2024/07/01: Multi-HMR is accepted to ECCV'24.
- 2024/06/17: Multi-HMR won Robin Challenge @CVPR'24: 3D human reconstruction track.
- 2024/02/22: Release of demo code.
First, you need to clone the repo.
We recommend using a virtual environment to run Multi-HMR.
Run the following commands to create the environment with venv:
python3.9 -m venv .multihmr
source .multihmr/bin/activate
pip install -r requirements.txt
Alternatively, you can create a conda environment:
conda env create -f conda.yaml
conda activate multihmr
The installation has been tested with python3.9 and CUDA 12.1.
Checkpoints will automatically be downloaded to $HOME/models/multiHMR the first time you run the demo code.
Besides these files, you also need to download the SMPLX model.
You will need the neutral model for running the demo code.
Please go to the corresponding website and register to get access to the downloads section.
Download the model and place SMPLX_NEUTRAL.npz in ./models/smplx/.
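Before running the demo, it can help to confirm that the file is where the code expects it. The following is a minimal sketch and not part of the repository: the helper name `check_smplx_model` is hypothetical, and the only thing taken from the instructions above is the location `./models/smplx/SMPLX_NEUTRAL.npz`.

```python
from pathlib import Path


def check_smplx_model(root="."):
    """Return the path to SMPLX_NEUTRAL.npz under root, or raise if it is missing.

    Hypothetical helper: only the location models/smplx/SMPLX_NEUTRAL.npz
    comes from the installation instructions.
    """
    path = Path(root) / "models" / "smplx" / "SMPLX_NEUTRAL.npz"
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found: download the neutral SMPLX model and place it there."
        )
    return path
```

Calling `check_smplx_model()` from the repository root either returns the path or fails with a message pointing at the expected location.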
The following command runs Multi-HMR on all images in the specified --img_folder and saves renderings of the reconstructions in --out_folder.
The --model_name flag specifies the model to use.
The --extra_views flag additionally renders side and bird's-eye views (BEV) of the reconstructed scene, and --save_mesh saves the meshes in a '.npy' file.
python3.9 demo.py \
--img_folder example_data \
--out_folder demo_out \
--extra_views 1 \
--model_name multiHMR_896_L
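To compare several checkpoints on the same images, the command above can be assembled programmatically. This is a minimal sketch under stated assumptions: `build_demo_command` is a hypothetical helper, it uses only the flags shown in the command above, and writing each model's output to its own subfolder is a choice made for this example.

```python
import shlex


def build_demo_command(model_name, img_folder="example_data",
                       out_folder="demo_out", extra_views=1):
    """Build the argument list for one demo.py run, using the flags shown above."""
    return [
        "python3.9", "demo.py",
        "--img_folder", img_folder,
        "--out_folder", out_folder,
        "--extra_views", str(extra_views),
        "--model_name", model_name,
    ]


if __name__ == "__main__":
    # Print one command per checkpoint; the per-model output folder is an
    # assumption of this sketch, not a convention of the repository.
    for name in ["multiHMR_896_L", "multiHMR_672_S"]:
        cmd = build_demo_command(name, out_folder=f"demo_out/{name}")
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` to launch the run directly.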
We provide multiple pre-trained checkpoints.
Here is a list of their associated features.
Once downloaded you need to place them into $HOME/models/multiHMR.
modelname | training data | backbone | resolution | runtime (ms) | PVE-3DPW-test | PVE-EHF | PVE-BEDLAM-val | comment |
---|---|---|---|---|---|---|---|---|
multiHMR_896_L HuggingFace model | BEDLAM+AGORA+CUFFS+UBody | ViT-L | 896x896 | 126 | 89.9 | 42.2 | 56.7 | initial ckpt |
multiHMR_672_L | BEDLAM+AGORA+CUFFS+UBody | ViT-L | 672x672 | 74 | 94.1 | 37.0 | 58.6 | longer training |
multiHMR_672_B | BEDLAM+AGORA+CUFFS+UBody | ViT-B | 672x672 | 43 | 94.0 | 43.6 | 67.2 | longer training |
multiHMR_672_S | BEDLAM+AGORA+CUFFS+UBody | ViT-S | 672x672 | 29 | 102.4 | 49.3 | 78.9 | longer training |
multiHMR_1288_L_bedlam | BEDLAM(train+val) | ViT-L | 1288x1288 | ? | ? | ? | ? | ckpt used for BEDLAM leaderboard |
multiHMR_1288_L_agora | BEDLAM(train+val)+AGORA(train+val) | ViT-L | 1288x1288 | ? | ? | ? | ? | ckpt used for AGORA leaderboard |
Runtimes are measured on a V100-32GB GPU.
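The table trades runtime against accuracy, so one way to read it is to fix a latency budget and take the most accurate checkpoint that fits. The sketch below does this for the four checkpoints with reported numbers; `pick_checkpoint` is a hypothetical helper, and using PVE-BEDLAM-val as the ranking metric is an assumption made for illustration.

```python
# (runtime in ms on a V100-32GB, PVE-BEDLAM-val), copied from the table above.
CHECKPOINTS = {
    "multiHMR_896_L": (126, 56.7),
    "multiHMR_672_L": (74, 58.6),
    "multiHMR_672_B": (43, 67.2),
    "multiHMR_672_S": (29, 78.9),
}


def pick_checkpoint(max_runtime_ms):
    """Return the checkpoint with the lowest PVE-BEDLAM-val within the runtime budget."""
    candidates = {
        name: values for name, values in CHECKPOINTS.items()
        if values[0] <= max_runtime_ms
    }
    if not candidates:
        raise ValueError(f"No checkpoint runs within {max_runtime_ms} ms")
    # Lower PVE is better, so take the minimum over the second field.
    return min(candidates, key=lambda name: candidates[name][1])
```

With a budget of 80 ms this selects `multiHMR_672_L`; with 50 ms it falls back to `multiHMR_672_B`.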
We provide code for training Multi-HMR using a single GPU on BEDLAM-training and evaluating it on BEDLAM-validation, EHF and 3DPW-test.
Activate the environment:
source .multihmr/bin/activate
export PYTHONPATH=`pwd`
First, download the BEDLAM dataset (6fps version) and place the files into data/BEDLAM.
The data structure of the directory should look like this:
data/BEDLAM
|
|---validation
|
|---20221018_1_250_batch01hand_zoom_suburb_b_6fps
|
|---png
|
|---seq_000000
|
|---seq_000000_0000.png
...
|---seq_000000_0235.png
...
|---seq_000249
...
|---20221019_3-8_250_highbmihand_orbit_stadium_6fps
|---training
|
|---20221010_3_1000_batch01hand_6fps
...
|---20221024_3-10_100_batch01handhair_static_highSchoolGym_30fps
|---all_npz_12_training
|
|---20221010_3_1000_batch01hand_6fps.npz
...
|---20221024_3-10_100_batch01handhair_static_highSchoolGym_30fps.npz
|---all_npz_12_validation
|
|---20221018_1_250_batch01hand_zoom_suburb_b_6fps.npz
...
|---20221019_3-8_250_highbmihand_orbit_stadium_6fps.npz
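A quick check of the layout can catch missing folders before the annotation step. The sketch below is not part of the repository: `check_bedlam_layout` is a hypothetical helper, and the rules it applies (each scene folder contains a `png` subfolder and has a same-named `.npz` file in `all_npz_12_<split>`) are inferred from the tree above.

```python
from pathlib import Path


def check_bedlam_layout(root="data/BEDLAM"):
    """Return a list of problems found in the BEDLAM directory layout."""
    root = Path(root)
    problems = []
    for split in ("training", "validation"):
        split_dir = root / split
        npz_dir = root / f"all_npz_12_{split}"
        for directory in (split_dir, npz_dir):
            if not directory.is_dir():
                problems.append(f"missing directory: {directory}")
        if not (split_dir.is_dir() and npz_dir.is_dir()):
            continue
        # Rules inferred from the tree: one .npz per scene, images under png/.
        for scene in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            npz_file = npz_dir / f"{scene.name}.npz"
            if not npz_file.is_file():
                problems.append(f"missing annotations: {npz_file}")
            png_dir = scene / "png"
            if not png_dir.is_dir():
                problems.append(f"missing png folder: {png_dir}")
    return problems
```

An empty list means the layout matches the structure shown above; otherwise each entry names the missing path.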
Next, build the annotation files for the training and validation sets. Building the pkl files may take around 20 minutes, depending on your CPU.
python3.9 datasets/bedlam.py "create_annots(['validation', 'training'])"
You will get two files: data/bedlam_validation.pkl and data/bedlam_training.pkl.
Visualize the annotation of a specific image.
python3.9 datasets/bedlam.py "visualize(split='validation', i=1500)"
It will create a file bedlam_validation_1500.jpg showing the RGB image on the left and the same image with the meshes overlaid on the right.
BEDLAM is composed of PNG files, and loading them can be slow depending on your infrastructure. The following command generates one jpg file for each png file, with a maximal resolution of 1280. It may take a while because BEDLAM has more than 300k images. You can run the command on specific subdirectories in parallel to speed up the generation of jpg files, and you can choose a different target size.
# Can be slow
python3.9 datasets/bedlam.py "create_jpeg(root_dir='data/BEDLAM', target_size=1280)"
# Or parallelize
python3.9 datasets/bedlam.py "create_jpeg(root_dir='data/BEDLAM/validation/20221019_3-8_250_highbmihand_orbit_stadium_6fps', target_size=1280)"
...
python3.9 datasets/bedlam.py "create_jpeg(root_dir='data/BEDLAM/training/20221010_3-10_500_batch01hand_zoom_suburb_d_6fps', target_size=1280)"
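The per-subdirectory commands can be generated and launched from a small script instead of being typed by hand. This is a minimal sketch, not part of the repository: `jpeg_commands` and `run_all` are hypothetical helpers, the scene folders are discovered by listing `training` and `validation`, and the only repository call used is `create_jpeg(root_dir=..., target_size=...)` as shown above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def jpeg_commands(root_dir="data/BEDLAM", target_size=1280,
                  splits=("training", "validation")):
    """Build one create_jpeg command per scene subdirectory."""
    commands = []
    for split in splits:
        split_dir = Path(root_dir) / split
        if not split_dir.is_dir():
            continue
        for scene in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            call = f"create_jpeg(root_dir='{scene.as_posix()}', target_size={target_size})"
            commands.append(["python3.9", "datasets/bedlam.py", call])
    return commands


def run_all(commands, max_workers=4):
    """Run the commands concurrently and return their exit codes.

    Threads are enough here because each command runs in its own process.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

Calling `run_all(jpeg_commands())` from the repository root processes up to four scenes at a time; `max_workers` should be adjusted to the number of available CPU cores.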
You can check your dataloader by running the command below. It will use the png version of BEDLAM.
python3.9 datasets/bedlam.py "dataloader(split='validation', batch_size=16, num_workers=4, extension='png', img_size=1280, n_iter=100)"
We also provide code for evaluating on EHF and 3DPW. Run the following commands to build the annotation file for EHF and visualize a sample.
python3.9 datasets/ehf.py "create_annots()"
python3.9 datasets/ehf.py "visualize(i=10)"
And for 3DPW: please download the SMPL-male and SMPL-female models and place them at models/smpl/SMPL_MALE.pkl and models/smpl/SMPL_FEMALE.pkl. The file smplx2smpl.pkl is also required to convert from SMPLX to SMPL.
python3.9 datasets/threedpw.py "create_annots()"
python3.9 datasets/threedpw.py "visualize(i=1011)"
We provide the command for training on BEDLAM-train at resolution 336 on a single GPU.
# python command
CUDA_VISIBLE_DEVICES=1 python3.9 train.py \
--backbone dinov2_vits14 \
--img_size 336 \
-j 4 \
--batch_size 32 \
-iter 10000 \
--max_iter 500000 \
--name multi-hmr_s_336
To decrease data-loading time, use --extension jpg --res 1280.
The command below evaluates a pretrained checkpoint on the validation sets.
CUDA_VISIBLE_DEVICES=0 python3.9 train.py \
--eval_only 1 \
--backbone dinov2_vitl14 \
--img_size 896 \
--val_data EHF THREEDPW BEDLAM \
--val_split test test validation \
--val_subsample 1 20 25 \
--pretrained models/multiHMR/multiHMR_896_L.pt
Check the results either in the log or in TensorBoard.
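The evaluation command passes three lists of the same length to --val_data, --val_split and --val_subsample. The sketch below illustrates one plausible reading, which is an assumption and not confirmed by the repository: the lists are matched by position, and a subsample value of k means keeping one sample out of every k. Both function names are hypothetical.

```python
def pair_validation_args(val_data, val_split, val_subsample):
    """Match the three lists by position into (dataset, split, subsample) triples."""
    if not (len(val_data) == len(val_split) == len(val_subsample)):
        raise ValueError(
            "--val_data, --val_split and --val_subsample need the same length"
        )
    return list(zip(val_data, val_split, val_subsample))


def subsample(items, step):
    """Keep every step-th item (assumed meaning of --val_subsample)."""
    return items[::step]


# Values taken from the evaluation command above.
pairs = pair_validation_args(
    ["EHF", "THREEDPW", "BEDLAM"], ["test", "test", "validation"], [1, 20, 25]
)
for name, split, step in pairs:
    print(f"{name}: split={split}, keep 1 sample out of {step}")
```

Under this reading, EHF is evaluated in full, while 3DPW-test and BEDLAM-validation are evaluated on every 20th and every 25th sample respectively.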
The Close-Up Frames of Full-Body Subjects (CUFFS) dataset, which contains humans close to the camera with diverse hand poses, is available here (LICENSE). More information about how to use it will be given soon; stay tuned.
The code is distributed under the CC BY-NC-SA 4.0 License.
See Multi-HMR LICENSE, Checkpoint LICENSE and Example Data LICENSE for more information.
If you find this code useful for your research, please consider citing the following paper:
@inproceedings{multi-hmr2024,
title={Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot},
author={Baradel*, Fabien and
Armando, Matthieu and
Galaaoui, Salma and
Br{\'e}gier, Romain and
Weinzaepfel, Philippe and
Rogez, Gr{\'e}gory and
Lucas*, Thomas
},
booktitle={ECCV},
year={2024}
}