This is the official code release for our paper "KMOPS: Keypoint-Driven Method for Multi-Object Pose and Metric Size Estimation from Stereo Images". [Paper] [Project Page]
- Build Docker image

  ```bash
  docker build -t kmops docker/
  ```

  This Docker image has been tested on:
  NVIDIA GeForce RTX 4090 / Driver version: 535.230.02 / CUDA version: 12.2
- Run Docker container

  Make sure the `$WORKSPACE_DIR` variable in `run_container.sh` specifies the path where you store this repository, then run:

  ```bash
  ./run_container.sh
  ```
- Run inference

  ```bash
  python demo.py
  ```

  This code is self-contained and will automatically download the model weights on first run. If the download fails, you can manually download the weights from here and then run the script again.
We use StereOBJ-1M and Keypose in our experiments. Please follow their instructions and prepare the data with the following directory structure:
- StereOBJ-1M

  ```
  data/stereobj_1m/
  ├── images_annotations/        ← scene image folders
  │   ├── biolab_scene_10_08212020_1/
  │   │   ├── 000000.jpg
  │   │   └── …
  │   ├── biolab_scene_10_08212020_2/
  │   │   ├── 000000.jpg
  │   │   └── …
  │   ├── biolab_scene_10_08212020_3/
  │   │   ├── 000000.jpg
  │   │   └── …
  │   └── …
  ├── objects/                   ← per-object bbox files
  │   ├── blade_razor.bbox
  │   └── …
  ├── split/                     ← train/val/test splits
  │   ├── biolab_object_list.txt
  │   └── …
  ├── camera.json                ← camera intrinsics/extrinsics
  └── val_label_merged.json      ← merged validation labels
  ```
- TOD (Keypose)

  ```
  data/tod/
  ├── objects/                   ← object mesh files
  ├── bottle_0/
  │   ├── texture_0_pose_0/      ← images and annotations
  │   ├── texture_0_pose_1/
  │   ├── texture_0_pose_2/
  │   └── …
  ├── mug_0/
  │   ├── texture_0_pose_0/      ← images and annotations
  │   ├── texture_0_pose_1/
  │   ├── texture_0_pose_2/
  │   └── …
  ├── cup_0/
  │   ├── texture_0_pose_0/      ← images and annotations
  │   ├── texture_0_pose_1/
  │   ├── texture_0_pose_2/
  │   └── …
  └── …
  ```
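After downloading, it can be useful to confirm that the top-level entries from the trees above are in place before converting anything. This is a convenience sketch, not part of the repository; `check_layout` and the example entries are only illustrative:

```python
from pathlib import Path

def check_layout(root, required):
    """Return the expected entries that are missing under the dataset root."""
    root = Path(root)
    return [name for name in required if not (root / name).exists()]

# Example for the StereOBJ-1M tree shown above:
# check_layout("data/stereobj_1m",
#              ["images_annotations", "objects", "split",
#               "camera.json", "val_label_merged.json"])
```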
- Convert data into .pkl files

  ```bash
  python tools/convert_stereobj.py
  python tools/convert_keypose.py
  ```

  For more details on how the data in the .pkl files is organized and how to prepare a custom dataset, please refer to dataset/custom_dataset.md.
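Once conversion finishes, you can sanity-check a generated file before pointing the configs at it. The snippet below is a sketch that makes no assumption about the record layout (which dataset/custom_dataset.md documents); it only reports the container type:

```python
import pickle

def inspect_pkl(path, max_keys=10):
    """Print and return a one-line structural summary of a .pkl file."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    if isinstance(data, (list, tuple)):
        summary = f"{type(data).__name__} with {len(data)} records"
    elif isinstance(data, dict):
        summary = f"dict with keys {list(data)[:max_keys]}"
    else:
        summary = type(data).__name__
    print(summary)
    return summary

# Example: inspect_pkl("data/stereobj_train.pkl")
```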
- Specify .pkl paths in configs

  For example, in conf/train_stereobj.yaml, specify the paths as follows:

  ```yaml
  dataset:
    train_pkl: "data/stereobj_train.pkl"
    val_pkl: "data/stereobj_val.pkl"
  ```

  Note that you can also provide multiple .pkl files as a list. For example:

  ```yaml
  dataset:
    train_pkl: [
      "data/stereobj_train_set1.pkl",
      "data/stereobj_train_set2.pkl"
    ]
    val_pkl: "data/stereobj_val.pkl"
  ```
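Since `train_pkl` may be either a single string or a list of strings, any code consuming this config field has to handle both forms. A minimal sketch of that normalization (the function name is illustrative, not taken from the KMOPS codebase):

```python
def as_path_list(value):
    """Accept a single path string or a list/tuple of paths; return a list."""
    if isinstance(value, str):
        return [value]
    return list(value)

# as_path_list("data/stereobj_val.pkl") -> ["data/stereobj_val.pkl"]
# as_path_list(["a.pkl", "b.pkl"])      -> ["a.pkl", "b.pkl"]
```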
We use the Hydra library to manage configurations; for more information, please refer to the Hydra documentation. Training configs are stored in conf/.

```bash
python train.py --config-name train_stereobj wandb_online=False run_name=8keypoints_stereobj_tempt1
```
Everything required for evaluation is automatically saved to the folder that stores the wandb logging data during training, so evaluation uses exactly the same settings as training.

```bash
python evaluate.py --wandb_folder 8keypoints_o2o_stereobj
```
The main structure of the code is adapted and modified from DETR, RT-DETR, and GroupPose.
Our evaluation code is borrowed directly from Ultralytics and SPD.
If you use this code for your research, please cite:

```bibtex
@InProceedings{Wu_2026_WACV,
    author    = {Wu, Ying-Kun and Shen, Yi and Huang, Tzuhsuan and Fang, I-Sheng and Chen, Jun-Cheng},
    title     = {KMOPS: Keypoint-Driven Method for Multi-Object Pose and Metric Size Estimation from Stereo Images},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {7730-7739}
}
```
