This repository provides a preprocessing pipeline for monocular videos containing human motion in static scenes.
Given an input video, our pipeline estimates camera poses, reconstructs human poses in world coordinates, and extracts monocular geometric cues (depth and surface normals). The processed data can then be used by HSR to create human-scene reconstructions.
This preprocessing pipeline is maintained as a standalone repository to facilitate its use in other applications beyond HSR.
The pipeline consists of the following sequential steps (a rough driver sketch follows the list):
- Extract and select sharp frames from a video or an image sequence
- Generate human masks
- Estimate camera poses
- Generate monocular depth and normal maps
- Estimate human poses in the camera coordinate frame
- Extract human 2D keypoints
- Refine human poses with 2D keypoints and temporal smoothness
- Align human poses in the world coordinate frame and scale the scene to metric units using the human body scale
- Save the processed data in an HSR-compatible format
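Below is a rough, hypothetical sketch of how a driver such as process_data.py can chain these steps via subprocesses; apart from select_frames.py (whose arguments are documented below), the script names and arguments here are illustrative assumptions, not the repository's actual interface.
import subprocess

# One command per pipeline step, run strictly in order.
steps = [
    ["python", "select_frames.py", "--input_path", "video.mp4", "--data_dir", "data/scene"],
    # ... one hypothetical entry per remaining step ...
]
for i, cmd in enumerate(steps):
    print(f"[step {i}] {' '.join(cmd)}")
    subprocess.run(cmd, check=True)  # abort at the first failing step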
Clone the repository and its submodules:
git clone https://github.com/lxxue/HSR-data-preprocessing.git --recursive
Set up the environment for Grounded-SAM2 and most of the code in this repository:
conda create -n hsr-data python=3.10
conda activate hsr-data
# SAM2.1 requires torch >=2.5.1
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
# For Grounded-SAM2
cd third_party/Grounded-SAM-2
cd checkpoints
bash download_ckpts.sh
cd ../
cd gdino_checkpoints
bash download_ckpts.sh
cd ../
export CUDA_HOME="/usr/local/cuda-12.1"
pip install -e .
pip install --no-build-isolation -e grounding_dino
pip install opencv-python supervision transformers addict yapf pycocotools timm
# For hloc
cd ../../
cd third_party/Hierarchical-Localization
git submodule update --init --recursive
pip install -e .
pip install pyquaternion scipy
pip install cython
pip install simple-romp
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py310_cu121_pyt251/download.html
pip install smplx open3d
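Optionally, a quick sanity check (a sketch, not part of the repository) that the core packages import and the GPU is visible:
import open3d
import pytorch3d
import smplx
import torch

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch3d", pytorch3d.__version__, "| open3d", open3d.__version__)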
Create a separate environment for Metric3Dv2 following the official instructions.
Build openpose python package following the official guide.
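After building, you can check that the bindings import; the sys.path entry below is an assumed build location and must match your install:
import sys

sys.path.append("/path/to/openpose/build/python")  # hypothetical build output path
from openpose import pyopenpose as op  # module name used by the OpenPose python API

print("pyopenpose imported successfully")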
Update the Python interpreter paths and the OpenPose model path in process_data.py and cmd.sh:
# process_data.py
SAM2_PYTHON_PATH = "/home/lixin/miniconda3/envs/sam21/bin/python"
METRIC3D_PYTHON_PATH = "/home/lixin/miniconda3/envs/metric3d/bin/python"
OPENPOSE_PYTHON_PATH = "/usr/bin/python3"
OPENPOSE_MODEL_PATH = "/home/lixin/softwares/openpose/models/"
# cmd.sh
SAM2_PYTHON_PATH="/home/lixin/miniconda3/envs/sam21/bin/python"
METRIC3D_PYTHON_PATH="/home/lixin/miniconda3/envs/metric3d/bin/python"
OPENPOSE_PYTHON_PATH="/usr/bin/python3"
OPENPOSE_MODEL_PATH="/home/lixin/softwares/openpose/models/"
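Optionally, a small sketch (using the example paths above) to fail fast if any configured interpreter is missing:
import os

interpreters = {
    "SAM2_PYTHON_PATH": "/home/lixin/miniconda3/envs/sam21/bin/python",
    "METRIC3D_PYTHON_PATH": "/home/lixin/miniconda3/envs/metric3d/bin/python",
    "OPENPOSE_PYTHON_PATH": "/usr/bin/python3",
}
for name, path in interpreters.items():
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{name} points to a missing interpreter: {path}")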
Download the SMPL models (version 1.1.0 for Python 2.7, female/male) and place them under checkpoints/smpl:
mkdir -p checkpoints/smpl
mv /path_to_smpl_models/basicmodel_f_lbs_10_207_0_v1.1.0.pkl checkpoints/smpl/SMPL_FEMALE.pkl
mv /path_to_smpl_models/basicmodel_m_lbs_10_207_0_v1.1.0.pkl checkpoints/smpl/SMPL_MALE.pkl
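To confirm the files are in place, a sketch that loads them through smplx (the v1.1.0 pickles contain chumpy arrays, so pip install chumpy may also be needed):
import smplx

# smplx.create("checkpoints", ...) resolves to checkpoints/smpl/SMPL_{GENDER}.pkl
for gender in ("female", "male"):
    model = smplx.create("checkpoints", model_type="smpl", gender=gender)
    print(gender, "->", type(model).__name__)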
Prepare the SMPL model files needed by ROMP according to the official instructions and place them under checkpoints/romp:
mkdir -p checkpoints/romp
mv /path_to_romp_models/SMPL_MALE.pth checkpoints/romp/SMPL_MALE.pth
mv /path_to_romp_models/SMPL_FEMALE.pth checkpoints/romp/SMPL_FEMALE.pth
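As a quick check, and assuming these are standard torch pickles, confirm they load:
import torch

for fname in ("checkpoints/romp/SMPL_FEMALE.pth", "checkpoints/romp/SMPL_MALE.pth"):
    data = torch.load(fname, map_location="cpu")
    print(fname, "->", type(data).__name__)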
We provide a Python script, process_data.py, and a shell script, run_process_data.sh, as examples for processing the data.
# Modify the arguments in run_process_data.sh first to fit your data
# Run each step with indices, e.g. 0 1 2 (modify indices as needed)
bash run_process_data.sh 0 1 2
You can also run each step separately by uncommenting the corresponding command in cmd.sh.
bash cmd.sh
Each script contains detailed documentation of its functionality. For example, in select_frames.py:
"""
Frame Selection Utility for Videos and Image Sequences
Arguments:
--input_path: path to the input video file or a directory of images
--data_dir: output directory for the processed data
--window_size: number of frames to consider in each selection window (default: 10)
--frame_start: starting frame number to process (default: 0)
--frame_end: ending frame number (inclusive) to process (default: 1000000)
--image_resize_factor: factor by which to reduce image size (1, 2, 4, or 8)
Output Structure:
data_dir/
├── images/
│   ├── all_frames/          # Contains all processed frames
│   ├── selected_frames/     # Contains selected sharp frames
│   └── selected_idxs.npy    # Numpy array of selected frame indices
"""
This work builds upon several excellent open-source projects. We would like to thank the authors of Vid2Avatar, NeuMan, hloc, COLMAP, Metric3D, Grounded-SAM2, OpenPose, and ROMP.
If you find this work useful for your research, please consider citing our paper:
@inproceedings{xue2024hsr,
author={Xue, Lixin and Guo, Chen and Zheng, Chengwei and Wang, Fangjinhua and Jiang, Tianjian and Ho, Hsuan-I and Kaufmann, Manuel and Song, Jie and Hilliges, Otmar},
title={{HSR:} Holistic 3D Human-Scene Reconstruction from Monocular Videos},
booktitle={European Conference on Computer Vision (ECCV)},
year={2024}
}