If you find our work useful, please give us a star ⭐! Your support drives us to keep improving.
🎯 Project Page • 📄 Paper • 🤗 Dataset
[TL;DR] Target-Bench is the first benchmark and dataset for evaluating video world models (WMs) on mapless robotic path planning for semantic targets.
- Fine-tuned checkpoints release
- Fine-tune code release
- Benchmark code release
- Dataset release
- Paper release
- Website launch
- [2026.01] We release the training code for fine-tuning world models!
- [2025.11] We release the Paper, Dataset, and Benchmark Code!
git clone https://github.com/TUM-AVS/target-bench.git
cd target-bench

Ensure you have miniconda installed.
You can set up all environments at once or individually. For a quick start with VGGT:
# Install VGGT environment
bash set_env.sh vggt

For other options (installing all environments or specific ones like SpaTracker/ViPE), please refer to docs/env.md.
Download the benchmark_data (scenarios) and wm_videos (generated videos) into the dataset/ directory:
cd dataset
# Download Benchmark scenarios
huggingface-cli download target-bench/benchmark_data --repo-type dataset --local-dir Benchmark --local-dir-use-symlinks False
# Download World Model generated videos
huggingface-cli download target-bench/wm_videos --repo-type dataset --local-dir wm_videos --local-dir-use-symlinks False
cd ..

Now, the project directory structure should look like this:
target-bench/
├── DiffSynth-Studio/   # DiffSynth-Studio for fine-tuning
├── assets/             # Images and project assets
├── dataset/            # Benchmark data and generated videos
│   ├── Benchmark/      # Benchmark scenarios
│   └── wm_videos/      # Videos generated by world models
├── evaluation/         # Evaluation scripts and configs
├── models/             # Source code for evaluated models
│   ├── spatracker/
│   ├── vggt/
│   └── vipe/
└── pipelines/          # World decoders adapted for each model
    ├── spatracker/
    ├── vggt/
    └── vipe/
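The dataset downloads above can also be scripted with the huggingface_hub Python API instead of the CLI. A minimal sketch, assuming `pip install huggingface_hub` and run from the repo root (repo IDs and target directories are taken from the commands above):

```python
# Sketch: fetch both dataset repos via the huggingface_hub API.
# Repo IDs / local dirs mirror the huggingface-cli commands above.
DATASETS = {
    "target-bench/benchmark_data": "dataset/Benchmark",  # benchmark scenarios
    "target-bench/wm_videos": "dataset/wm_videos",       # world-model videos
}

def fetch_all(datasets=DATASETS):
    # Imported lazily so the mapping is usable without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    for repo_id, local_dir in datasets.items():
        snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=local_dir)
```

Call `fetch_all()` from the repository root; `snapshot_download` resumes partially downloaded repos, so it is safe to re-run.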
Run a quick evaluation with 3 scenes using VGGT as the spatial-temporal tool:
conda activate vggt
cd evaluation
python target_eval_vggt.py -n 3

Then you should see the evaluation results and visualizations in the evaluation_results folder.
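To glance at what a run produced, here is a minimal sketch that lists files under the results folder, newest first (the folder name comes from the text above; the file layout inside it is not specified here):

```python
from pathlib import Path

def list_results(root="evaluation_results"):
    """Return all files under the results folder, newest first."""
    entries = sorted(
        Path(root).rglob("*"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    return [p for p in entries if p.is_file()]
```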
conda deactivate
conda create -n target-finetune python=3.10 -y
conda activate target-finetune
cd DiffSynth-Studio
pip install -r requirements.txt

cd DiffSynth-Studio/models/train
# Download Fine-tuned Checkpoint
huggingface-cli download target-bench/ckpts --repo-type model --local-dir ckpts --local-dir-use-symlinks False
cd ../..

huggingface-cli download target-bench/finetune_dataset --repo-type dataset --local-dir dataset --local-dir-use-symlinks False
cd dataset
# data_four_segments_121_frames.zip contains the data augmentation result
unzip data_four_segments_121_frames.zip
unzip data_single_segment_121_frames.zip
unzip data_inference.zip
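After extraction, you can sanity-check that everything unpacked. A sketch, assuming each archive extracts to a folder named after the zip (this naming is an assumption, not confirmed by the repo):

```python
from pathlib import Path

# Assumed folder names: each zip above unpacks to a same-named directory.
EXPECTED = [
    "data_four_segments_121_frames",
    "data_single_segment_121_frames",
    "data_inference",
]

def check_extracted(root="."):
    """Return the list of expected folders missing under root (empty = OK)."""
    return [name for name in EXPECTED if not (Path(root) / name).is_dir()]
```

Run it from the dataset/ directory; an empty return value means all three folders are in place.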
cd ..

Run inference with the checkpoint fine-tuned using data augmentation:
python run_inference_four_segments_epoch-49_batch.py

Fine-tune Wan2.2-TI2V-5B on 325 scenarios using the data augmentation result:
bash Wan2.2-TI2V-5B_four_segments.sh

@article{wang2025target,
title={Target-Bench: Can World Models Achieve Mapless Path Planning with Semantic Targets?},
author={Wang, Dingrui and Ye, Hongyuan and Liang, Zhihao and Sun, Zhexiao and Lu, Zhaowei and Zhang, Yuchen and Zhao, Yuyu and Gao, Yuan and Seegert, Marvin and Sch{\"a}fer, Finn and others},
journal={arXiv preprint arXiv:2511.17792},
year={2025}
}

This project builds upon the following open-source works:
Please refer to their respective directories for detailed credits and license information.

