This repository contains the PyTorch implementation of the paper "Aligning Motion Generation with Human Perceptions".
MotionCritic can score a single motion with just a few lines of code. First, prepare the SMPL body model files:

```bash
bash prepare/prepare_smpl.sh
```

Then run:
```python
import torch

from lib.model.load_critic import load_critic
from parsedata import into_critic

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
critic_model = load_critic("critic/motioncritic_pre.pth", device)

example = torch.load("criexample.pth", map_location=device)
example_motion = example['motion']  # [bs, 25, 6, frame]: rot6d, 24 SMPL joints plus 1 XYZ root location

# motion pre-processing: convert to the critic's input format
preprocessed_motion = into_critic(example_motion)  # [bs, frame, 25, 3]: axis-angle, 24 SMPL joints plus 1 XYZ root location

# critic score
critic_scores = critic_model.module.batch_critic(preprocessed_motion)
print(f"critic scores are {critic_scores}")  # the critic score is 4.1297 for this example
```
The corresponding motion is rendered in `criexample.mp4`.
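The same pipeline works for your own data. Below is a minimal sketch, assuming your motion is already a rot6d tensor in the `[bs, 25, 6, frame]` layout described above; the random tensor only stands in for real data and is there to illustrate the expected shapes:

```python
import torch

from lib.model.load_critic import load_critic
from parsedata import into_critic

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
critic_model = load_critic("critic/motioncritic_pre.pth", device)

# Stand-in for your own data: a batch of 2 motions, 60 frames each,
# in the [bs, 25, 6, frame] rot6d format (random values, illustration only).
my_motion = torch.randn(2, 25, 6, 60, device=device)

# Convert to the critic's axis-angle input format and score the batch.
scores = critic_model.module.batch_critic(into_critic(my_motion))
print(scores)  # one critic score per motion in the batch
```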
You can also score and render multiple motions with a bit more code. First, prepare the demo data:

```bash
bash prepare/prepare_demo.sh
```

Then run:
```python
import torch

from lib.model.load_critic import load_critic
from parsedata import into_critic
from render.render import render_multi

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
critic_model = load_critic("critic/motioncritic_pre.pth", device)
example = torch.load("visexample.pth", map_location=device)

# calculate critic scores
critic_scores = critic_model.module.batch_critic(into_critic(example['motion']))
print(f"critic scores are {critic_scores}")

# render the motions
render_multi(example['motion'], device, example['comment'], example['path'])
```
The rendered results are shown in `demo.mp4`.
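Because the critic scores a whole batch at once, the scores can be used directly to rank candidate motions. A small sketch, assuming `critic_scores` from the snippet above is a 1-D tensor with one score per motion:

```python
import torch

# Rank the batch from highest to lowest critic score
# (assumes `critic_scores` holds one score per motion).
flat_scores = critic_scores.flatten()
order = torch.argsort(flat_scores, descending=True)
for rank, idx in enumerate(order.tolist(), start=1):
    print(f"rank {rank}: motion {idx}, critic score {flat_scores[idx].item():.4f}")
```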
To set up the environment, create and activate the conda environment:

```bash
conda env create -f environment.yml
conda activate mocritic
```
Download the pre-processed datasets and pretrained models:
```bash
bash prepare/prepare_dataset.sh     # Download pre-processed datasets
bash prepare/prepare_pretrained.sh  # Download pretrained models
```
Alternatively, you can manually download the files from the following links:
- Pre-processed datasets: Google Drive Link
- Pretrained MotionCritic model: Google Drive Link
To build your own dataset from the original motion files and annotation results:
```bash
bash prepare/prepare_fullannotation.sh
bash prepare/prepare_fullmotion.sh
```
Manual downloads are available here:
- Full annotation results: Google Drive Link
- Complete motion .npz files: Google Drive Link
After pre-processing the complete data, build your dataset with:
```bash
cd MotionCritic
python parsedata.py
```
Reproduce the results from the paper by running:
```bash
cd MotionCritic/metric
python metrics.py
python critic_score.py
```
Train your own critic model with the following command:
```bash
cd MotionCritic
python train.py --gpu_indices 0 --exp_name my_experiment --dataset mdmfull_shuffle --save_latest --lr_decay --big_model
```
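After training, you can evaluate your own checkpoint the same way as the pretrained critic. A minimal sketch, where the checkpoint path is hypothetical and should point to wherever your training run saved its weights:

```python
import torch

from lib.model.load_critic import load_critic
from parsedata import into_critic

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical path: replace with the checkpoint written by your training run.
my_checkpoint = "checkpoints/my_experiment/latest.pth"
critic_model = load_critic(my_checkpoint, device)

# Score the bundled example motion with the newly trained critic.
example = torch.load("criexample.pth", map_location=device)
scores = critic_model.module.batch_critic(into_critic(example['motion']))
print(scores)
```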
To fine-tune MDM with MotionCritic supervision, first prepare the MDM baseline:

```bash
bash prepare/prepare_MDM_dataset.sh
bash prepare/prepare_MDM_pretrained.sh
```
If you encounter any issues, refer to the MDM baseline setup.
Next, start MotionCritic-supervised fine-tuning:
```bash
cd MDMCritic

python -m train.tune_mdm \
    --dataset humanact12 --cond_mask_prob 0 --lambda_rcxyz 1 --lambda_vel 1 --lambda_fc 1 \
    --resume_checkpoint ./save/humanact12/model000350000.pt \
    --reward_model_path ./reward/motioncritic_pre.pth \
    --device 0 \
    --num_steps 1200 \
    --save_interval 100 \
    --reward_scale 1e-4 --kl_scale 5e-2 --random_reward_loss \
    --ddim_sampling \
    --eval_during_training \
    --sample_when_eval \
    --batch_size 64 --lr 1e-5 \
    --denoise_lower 700 --denoise_upper 900 \
    --use_kl_loss \
    --save_dir save/finetuned/my_experiment \
    --wandb my_experiment
```
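Conceptually, `--reward_scale` and `--kl_scale` weight a critic-based reward against a KL term that keeps the fine-tuned model close to the original MDM. The sketch below only illustrates how such an objective can be combined and is inferred from the flag names; the actual training code lives in `MDMCritic/train` and may differ in detail:

```python
import torch

def critic_finetune_objective(critic_scores: torch.Tensor,
                              kl_div: torch.Tensor,
                              reward_scale: float = 1e-4,
                              kl_scale: float = 5e-2) -> torch.Tensor:
    # Illustrative only: maximize the critic reward (hence the minus sign)
    # while penalizing divergence from the original diffusion model.
    reward_term = -reward_scale * critic_scores.mean()
    kl_term = kl_scale * kl_div.mean()
    return reward_term + kl_term
```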
Additional Python scripts for various fine-tuning purposes can be found in `MDMCritic/train` and are detailed in the fine-tuning documentation.
If you find our work useful for your project, please consider citing the paper:
```bibtex
@article{motioncritic2024,
  title={Aligning Motion Generation with Human Perceptions},
  author={Wang, Haoru and Zhu, Wentao and Miao, Luyi and Xu, Yishu and Gao, Feng and Tian, Qi and Wang, Yizhou},
  journal={arXiv preprint arXiv:2407.02272},
  year={2024}
}
```
If you use MotionPercept and MotionCritic in your work, please also cite the original datasets and methods on which our work is based.
MDM:
```bibtex
@inproceedings{tevet2023human,
  title={Human Motion Diffusion Model},
  author={Guy Tevet and Sigal Raab and Brian Gordon and Yoni Shafir and Daniel Cohen-or and Amit Haim Bermano},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=SJ1kSyO2jwu}
}
```

HumanAct12:
```bibtex
@inproceedings{guo2020action2motion,
  title={Action2motion: Conditioned generation of 3d human motions},
  author={Guo, Chuan and Zuo, Xinxin and Wang, Sen and Zou, Shihao and Sun, Qingyao and Deng, Annan and Gong, Minglun and Cheng, Li},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={2021--2029},
  year={2020}
}
```

FLAME:
```bibtex
@inproceedings{kim2023flame,
  title={Flame: Free-form language-based motion synthesis \& editing},
  author={Kim, Jihoon and Kim, Jiseob and Choi, Sungjoon},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={37},
  number={7},
  pages={8255--8263},
  year={2023}
}
```

UESTC:
```bibtex
@inproceedings{ji2018large,
  title={A large-scale RGB-D database for arbitrary-view human action recognition},
  author={Ji, Yanli and Xu, Feixiang and Yang, Yang and Shen, Fumin and Shen, Heng Tao and Zheng, Wei-Shi},
  booktitle={Proceedings of the 26th ACM International Conference on Multimedia},
  pages={1510--1518},
  year={2018}
}
```

DSTFormer:
```bibtex
@inproceedings{zhu2023motionbert,
  title={Motionbert: A unified perspective on learning human motion representations},
  author={Zhu, Wentao and Ma, Xiaoxuan and Liu, Zhaoyang and Liu, Libin and Wu, Wayne and Wang, Yizhou},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={15085--15099},
  year={2023}
}
```

SMPL:
```bibtex
@incollection{loper2023smpl,
  title={SMPL: A skinned multi-person linear model},
  author={Loper, Matthew and Mahmood, Naureen and Romero, Javier and Pons-Moll, Gerard and Black, Michael J},
  booktitle={Seminal Graphics Papers: Pushing the Boundaries, Volume 2},
  pages={851--866},
  year={2023}
}
```
We also recommend exploring other motion metrics, including PoseNDF, NPSS, NDMS, MoBERT, and PFC. You can also check out a survey of different motion generation metrics, datasets, and approaches.