Our model was trained on the following datasets (~1.5M samples in total); the top and bottom 5% of each file were excluded from training and held out for validation/testing:
- skeleton_trainv8_flux_0.txt_new.txt # Flux-generated data
- skeleton_trainv8_flux_1.txt_new.txt # Flux-generated data
- skeleton_trainv8_pinterest.txt_new.txt # Web-crawled data
- skeleton_trainv8_reelshort.txt_new.txt # Vertical short-video data
- skeleton_trainv8_vcg_0.txt_new.txt # Web-crawled data
- skeleton_trainv8_vcg_1.txt_new.txt # Web-crawled data
- skeleton_trainv8_vcg_2.txt_new.txt # Web-crawled data
- skeleton_trainv8_vcg_3.txt_new.txt # Web-crawled data
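The hold-out scheme above could be sketched as follows. This is an illustrative sketch, not the project's actual code, and it assumes "top and bottom 5%" refers to position within each file; the project may rank samples by another criterion.

```python
def split_holdout(samples, frac=0.05):
    """Drop the first and last `frac` of a dataset file from training
    and keep them for validation/testing (illustrative helper)."""
    n = len(samples)
    k = int(n * frac)
    heldout = samples[:k] + samples[n - k:]   # top and bottom 5%
    train = samples[k:n - k]                  # middle 90% used for training
    return train, heldout

train, heldout = split_holdout(list(range(100)))
# train keeps the middle 90 samples; heldout gets the first and last 5
```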
We additionally incorporated skeleton data from the COYO open-source dataset, though these were not used in final model iterations due to project adjustments:
- coyo_two_people_0.txt
- coyo_two_people_1.txt
- coyo_two_people_2.txt
- coyo_two_people_3.txt
- coyo_two_people_4.txt
- coyo274w_0.txt
- coyo274w_1.txt
- coyo274w_2.txt
- coyo274w_3.txt
- coyo274w_4.txt
- coyo274w_5.txt
- coyo274w_6.txt
- coyo274w_7.txt
- coyo274w_8.txt
- coyo274w_9.txt
- Dataset labels were generated using complex prompts + GPT-4o, with multi-dimensional annotations (see script: tools/gen_caption.py)
- During training, descriptions from different dimensions are randomly combined to create a richer text distribution
- We also provide simple single-sentence prompts generated via gen_prompt_simple in tools/gen_caption.py
- These simple prompts were used to train our evaluation model: RHM2DGen_eval
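The random combination of per-dimension descriptions could look like the sketch below. This is an assumption-laden illustration, not the repo's actual code; the dimension names (`pose`, `clothing`, `background`) are hypothetical examples.

```python
import random

def combine_caption(annotations, rng=random):
    """Randomly pick a subset of caption dimensions and join their
    descriptions into one training prompt (illustrative sketch)."""
    dims = list(annotations)
    k = rng.randint(1, len(dims))     # how many dimensions to use
    chosen = rng.sample(dims, k)      # which dimensions, without repeats
    return " ".join(annotations[d] for d in chosen)

caption = combine_caption({
    "pose": "two people standing side by side",
    "clothing": "subject0 wears a red coat",
    "background": "a snowy street at dusk",
})
```

Resampling the combination every epoch exposes the model to many phrasings of the same image, which is the "richer text distribution" mentioned above.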
- Pipeline: Face/human detection → Region extraction using detection boxes + SAM (resolving overlaps in multi-person cases) → Skeleton extraction and subsequent labeling using SAM masks
- Note: to ensure GPT-4o recognizes character relationships and keeps descriptions consistent in multi-person scenes, we supply each person's SAM-segmented region to define character names (subject0/subject1), while keeping the whole image as context for the detail descriptions
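The pipeline above can be sketched as the following orchestration. Every function here is a placeholder stub standing in for the real component (a face/human detector, SAM, a pose estimator, GPT-4o); none of this is the repo's actual API.

```python
def detect_humans(image):            # stub for the face/human detector
    return [(0, 0, 50, 100), (60, 0, 110, 100)]   # two fake boxes

def segment_with_sam(image, box):    # stub for SAM; a real mask would be pixels
    return {"box": box}

def resolve_overlaps(masks):         # stub: real code disentangles overlapping
    return masks                     # regions in multi-person cases

def extract_skeleton(image, mask):   # stub for the pose estimator
    return []                        # real output: skeleton keypoints

def gpt4o_caption(image, regions):   # stub: each SAM region defines a name
    return ", ".join(f"subject{i}" for i in range(len(regions)))

def label_image(image):
    boxes = detect_humans(image)                          # 1. detection
    masks = [segment_with_sam(image, b) for b in boxes]   # 2. boxes + SAM
    masks = resolve_overlaps(masks)                       #    overlap handling
    skeletons = [extract_skeleton(image, m) for m in masks]  # 3. skeletons
    caption = gpt4o_caption(image, regions=masks)         # 4. region-named caption
    return skeletons, caption

skeletons, caption = label_image(None)
```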
We provide original images, JSON files, and test prompts for the evaluation sets:
- eval_single_1k
- eval_double_1k
- For image data requests (non-commercial use only), please email: wxktongji@163.com
- Download link: Baidu Drive (Code: 4ujq)
- Environment: environment.yml
- Training: python3 train.py
- Inference: python3 infer.py
- Download link: Baidu Drive (Code: rvgs)
- Code is located in RHM2DGen_eval/, developed on top of MDM's evaluation framework (modified for skeleton points)
- Environment: RHM2DGen_eval/environment.yml
- Prompt: tools/gen_caption.py, using simple prompts (gen_prompt_simple)
- Training script: RHM2DGen_eval/train.py
- Evaluation script: RHM2DGen_eval/eval.py
- Double-character eval model: https://pan.baidu.com/s/13AQyPMiAyv56-xVDBJ165A?pwd=y2wz (extraction code: y2wz)
- Single-character eval model: https://pan.baidu.com/s/1_FOAq0STq74r4UVOt8w2gw?pwd=dsuf (extraction code: dsuf)
A detailed technical report will be released later.
Primary contributors: Xuekuan Wang, Haoyu Yin, Haoyu Zheng, Yuqiu Huang, Keqiang Sun, Feng Qiu, Yunhao Shui, Junru Qiu

