A unified robot benchmarking framework that supports automated evaluation of CALVIN, LIBERO, Simpler, RoboTwin 2.0, and ManiSkill2 environments.
Dexbotic Benchmark provides a comprehensive evaluation framework for robotic learning algorithms across multiple environments:
- CALVIN: A large-scale dataset and benchmark for learning long-horizon manipulation tasks
- LIBERO: A benchmark for learning robotic manipulation from human demonstrations
- Simpler: A framework for evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge)
- RoboTwin 2.0: A scalable data generator and benchmark with strong domain randomization for robust bimanual robotic manipulation
- ManiSkill2: A benchmark for generalizable manipulation skill learning with diverse tasks and robot embodiments
System Requirements:
- A machine equipped with an NVIDIA GPU (single GPU recommended; tested on 2080Ti, A100, H100, and 4090)
- Docker with GPU support (NVIDIA Container Toolkit); a quick check is shown below
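Before installing, you may want to confirm that Docker containers can actually see the GPU. A minimal sanity check, assuming the NVIDIA Container Toolkit is set up (the CUDA image tag is just an example; any CUDA-enabled image works):

```bash
# Verify that Docker can expose the NVIDIA GPU to containers.
# The CUDA image tag below is only an example, not a project requirement.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If `nvidia-smi` prints your GPU, the evaluation containers should be able to use it as well.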
```bash
# Clone the repository
git clone https://github.com/Dexmal/dexbotic-benchmark.git
cd dexbotic-benchmark
# Initialize submodules
git submodule update --init --recursive
```

For users who prefer containerized deployment, you can use Docker to run the evaluation environments:
```bash
docker pull dexmal/dexbotic_benchmark
```

Important Note: The Docker image serves as a client and requires a separate dexbotic model server to be running. Make sure the dexbotic model server is started before running the evaluation commands below.
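Because the evaluation containers run with `--network host` and talk to the model server over the network, it can save time to verify that the server is reachable before launching an evaluation. The host and port below are placeholders (this README does not specify them); substitute the address your dexbotic model server actually listens on:

```bash
# Hypothetical connectivity check; SERVER_HOST and SERVER_PORT are placeholders,
# not values documented here. Replace them with your dexbotic server's address.
SERVER_HOST=127.0.0.1
SERVER_PORT=8000
if nc -z "$SERVER_HOST" "$SERVER_PORT"; then
    echo "dexbotic model server reachable at ${SERVER_HOST}:${SERVER_PORT}"
else
    echo "nothing listening on ${SERVER_HOST}:${SERVER_PORT}; start the dexbotic model server first" >&2
fi
```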
```bash
# Run CALVIN evaluation
docker run --gpus all --network host -v $(pwd):/workspace \
dexmal/dexbotic_benchmark \
bash /workspace/scripts/env_sh/calvin.sh /workspace/evaluation/configs/calvin/example_cavin.yaml
```
```bash
# Run LIBERO evaluation
docker run --gpus all --network host -v $(pwd):/workspace \
dexmal/dexbotic_benchmark \
bash /workspace/scripts/env_sh/libero.sh /workspace/evaluation/configs/libero/example_libero.yaml
```
```bash
# Run Simpler evaluation
docker run --gpus all --network host -v $(pwd):/workspace \
-e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all \
dexmal/dexbotic_benchmark \
bash scripts/env_sh/simpler.sh evaluation/configs/simpler/example_simpler.yaml
```
```bash
# Run RoboTwin evaluation
# Note: You need to download the RoboTwin assets and mount them to the container (ref: https://robotwin-platform.github.io/doc/usage/robotwin-install.html#4-download-assets-robotwin-od-texture-library-and-embodiments)
docker run --gpus all --network host \
-v [path/to/assets]:[path/to/assets] \
-v [path/to/assets]:/app/assets \
-v [path/to/assets]:/app/RoboTwin/assets \
-v $(pwd)/evaluation:/app/evaluation \
-v $(pwd)/scripts:/app/scripts \
-v $(pwd)/result_test:/app/result_test \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics \
dexmal/dexbotic_benchmark \
bash scripts/env_sh/robotwin2.sh evaluation/configs/robotwin2/example_robotwin2.yaml
```
```bash
# Run ManiSkill2 evaluation
docker run --gpus all --network host -v $(pwd):/workspace \
dexmal/dexbotic_benchmark \
python evaluation/run_maniskill2_evaluation.py --config evaluation/configs/maniskill2/example_maniskill2.yaml
```

Note: RoboTwin 2.0 has 50 sub-tasks, and each sub-task has two difficulty levels. Following the official RoboTwin 2.0 setting, each sub-task must be evaluated separately. You can modify the `task_name` and `task_config` parameters in the configuration file to select different sub-tasks and difficulty levels (ref: https://robotwin-platform.github.io/leaderboard).
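To sweep several RoboTwin 2.0 sub-tasks, one option is to generate a per-task copy of the example config and evaluate each one separately. A minimal sketch, assuming the config stores the sub-task as a top-level `task_name:` key; the task names here are placeholders, and the script is invoked directly as it would run inside the container or a local install (in the Docker workflow, pass the generated config path as the last argument of the `docker run` command above):

```bash
# Sketch: evaluate several RoboTwin 2.0 sub-tasks one at a time.
# Assumes the config has a top-level "task_name: <name>" line; the task
# names below are placeholders, not the official task list.
BASE_CFG=evaluation/configs/robotwin2/example_robotwin2.yaml
for TASK in place_empty_cup beat_block_hammer; do
    CFG=/tmp/robotwin2_${TASK}.yaml
    sed "s/^task_name:.*/task_name: ${TASK}/" "$BASE_CFG" > "$CFG"
    bash scripts/env_sh/robotwin2.sh "$CFG"
done
```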
After running the Docker commands, evaluation results are saved in the location specified by the `output_dir` parameter in your configuration file. For example:
- Results Location: Check the `output_dir` field in your configuration file (e.g., `evaluation/configs/calvin/example_cavin.yaml`); a small helper sketch follows this list
- Default Output: Results are saved in the `./result_test/` directory by default
- Log Files: Console output contains detailed evaluation progress and result information
- Configuration Files: Evaluation configuration files are located in the `evaluation/configs/` directory
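If you want to locate the results programmatically, a simple helper is to read `output_dir` out of the config and list what is there. This assumes `output_dir` is a plain top-level YAML key, which may not hold for every config:

```bash
# Sketch: read output_dir from a config and list the evaluation results.
# Assumes a top-level "output_dir: <path>" line; falls back to ./result_test.
CONFIG=evaluation/configs/calvin/example_cavin.yaml
OUT_DIR=$(grep -E '^output_dir:' "$CONFIG" | awk '{print $2}')
ls -R "${OUT_DIR:-./result_test}"
```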
For detailed local installation instructions, please refer to the comprehensive guide in docs/local_install.md.
We welcome contributions to improve the Dexbotic Benchmark framework. Please feel free to submit issues and pull requests.
This project is licensed under the terms specified in the LICENSE file.