This repo is the official implementation of "CapeNext: Rethinking and Refining Dynamic Support Information for Category-Agnostic Pose Estimation" [AAAI-2026].

The CapeNext paper is available on arXiv. CapeNext is a new framework that integrates hierarchical cross-modal interaction with dual-stream feature refinement, enhancing the joint embeddings with both class-level and instance-specific cues drawn from textual descriptions and support images. Experiments on the MP-100 dataset demonstrate that, regardless of the backbone, CapeNext consistently outperforms state-of-the-art CAPE methods by a large margin.
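For intuition, the fusion described above can be pictured as keypoint embeddings cross-attending to class-level text cues and instance-specific image cues. The sketch below is illustrative only; the module name, dimensions, and two-stream layout are assumptions, not the actual CapeNext implementation.

```python
# Illustrative sketch only: a minimal cross-modal fusion step in the spirit of
# "class-level text cues + instance-specific image cues -> refined keypoint queries".
# Module and argument names are assumptions, not the CapeNext implementation.
import torch
import torch.nn as nn


class CrossModalFusionSketch(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # One attention stream over text (class-level) cues, one over image (instance) cues.
        self.text_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.image_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, kpt_queries, text_tokens, image_tokens):
        # kpt_queries:  (B, K, dim)  keypoint embeddings to be refined
        # text_tokens:  (B, T, dim)  projected CLIP text embeddings
        # image_tokens: (B, N, dim)  support image features
        q = kpt_queries
        q = q + self.text_attn(q, text_tokens, text_tokens)[0]     # class-level cues
        q = q + self.image_attn(q, image_tokens, image_tokens)[0]  # instance-specific cues
        return self.norm(q)


if __name__ == "__main__":
    fusion = CrossModalFusionSketch()
    out = fusion(torch.randn(2, 17, 256), torch.randn(2, 17, 256), torch.randn(2, 196, 256))
    print(out.shape)  # torch.Size([2, 17, 256])
```
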
| Methods | Image Backbone | CLIP Backbone | Split 1 | Split 2 | Split 3 | Split 4 | Split 5 | Avg |
|---|---|---|---|---|---|---|---|---|
| POMNet | ResNet-50 | - | 84.23 | 78.25 | 78.17 | 78.68 | 79.17 | 79.70 |
| CapeFormer | ResNet-50 | - | 89.45 | 84.88 | 83.59 | 83.53 | 85.09 | 85.31 |
| ESCAPE | ResNet-50 | - | 86.89 | 82.55 | 81.25 | 81.72 | 81.32 | 82.74 |
| MetaPoint+ | ResNet-50 | - | 90.43 | 85.59 | 84.52 | 84.34 | 85.96 | 86.17 |
| X-Pose | ResNet-50 | ViT-Base-32 | 89.07 | 85.05 | 85.26 | 85.52 | 85.79 | 86.14 |
| SDPNet | HRNet-32 | - | 91.54 | 86.72 | 85.49 | 85.77 | 87.26 | 87.36 |
| GraphCape | Swinv2-T | - | 91.19 | 87.81 | 85.68 | 85.87 | 85.61 | 87.23 |
| CapeX | HRNet-w32 | ViT-Base-32 | 89.1 | 85.0 | 81.9 | 84.4 | 85.4 | 85.2 |
| CapeX | ViT-Base-16 | ViT-Base-32 | 90.75 | 82.87 | 83.18 | 85.95 | 85.49 | 85.65 |
| CapeX | DINOv2-ViT-S | ViT-Base-32 | 90.6 | 83.74 | 83.67 | 86.87 | 85.93 | 86.18 |
| CapeX | Swinv2-T | ViT-Base-32 | 91.9 | 86.97 | 84.41 | 86.13 | 88.64 | 87.61 |
| CapeNext | HRNet-w32 | ViT-Base-32 | 90.2 | 86.0 | 82.9 | 85.4 | 87.1 | 86.3 |
| CapeNext | ViT-Base-16 | ViT-Base-32 | 90.84 | 86.73 | 86.5 | 82.44 | 87.91 | 86.88 |
| CapeNext | DINOv2-ViT-S | ViT-Base-32 | 92.12 | 87.75 | 83.76 | 87.16 | 88.95 | 87.95 |
| CapeNext | Swinv2-T | ViT-Base-32 | 92.44 | 87.31 | 85.44 | 86.47 | 90.17 | 88.37 |
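For reference, the Avg column is the plain mean of the five split scores, e.g. for the last row:

```python
# "Avg" is the mean of the five split scores (CapeNext, Swinv2-T row).
splits = [92.44, 87.31, 85.44, 86.47, 90.17]
print(round(sum(splits) / len(splits), 2))  # 88.37
```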
Please run:
```
conda env create -f capenext_env.yml
conda activate capenext
```
Please follow the official guide to prepare the MP-100 dataset for training and evaluation, and organize the files according to the expected directory structure.
Then, use Pose Anything's updated annotation files, which include all the skeleton definitions, from the following link.
The pretrained Swin-Transformer-V2-Tiny weights are taken from the repo at the following link; place them in the ./pretrained folder.
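Before training, a quick sanity check along these lines can confirm the files are where the configs expect them. The paths below are assumptions based on a typical MP-100 setup, so adjust them to match the official guide:

```python
# Sanity check for an assumed MP-100 layout; adjust the paths if your
# structure from the official data-preparation guide differs.
from pathlib import Path

expected = [
    "data/mp100/images",       # per-category image folders (assumed location)
    "data/mp100/annotations",  # updated Pose Anything annotation files (assumed location)
    "pretrained",              # Swin-Transformer-V2-Tiny weights go here
]

for p in expected:
    status = "ok" if Path(p).exists() else "MISSING"
    print(f"{status:8s} {p}")
```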
To train the model, run:
```
python train.py --config [path_to_config_file] --work-dir [path_to_work_dir]
```
For example:
```
# capenext setting
python train.py --config configs/clip/clip_split1_config.py \
    --work-dir work_dirs/tiny/clip/capenext/split1 \
    --cfg-options data.samples_per_gpu=32 data.workers_per_gpu=32 \
    additional_module_cfg.module_name="SimpleMultiModalModule"
```
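The `--cfg-options` flag overrides fields of the Python config file at launch time using dotted keys. As a rough sketch (only the dotted keys shown in the command above are taken from this repo; the surrounding values are illustrative assumptions), the relevant fragment of `configs/clip/clip_split1_config.py` might look like:

```python
# Sketch of the config fields targeted by the --cfg-options overrides above.
# Only the dotted keys from the command are taken from this repo; the values
# shown here are illustrative assumptions, not the shipped config.
data = dict(
    samples_per_gpu=16,   # overridden by data.samples_per_gpu=32
    workers_per_gpu=8,    # overridden by data.workers_per_gpu=32
)

additional_module_cfg = dict(
    module_name="SimpleMultiModalModule",  # selects the multi-modal module
)
```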
Here we provide the evaluation results of our pretrained models on the MP-100 dataset, along with the config files and checkpoints:

| Setting | Backbone | Split 1 | Split 2 | Split 3 | Split 4 | Split 5 | Average |
|---|---|---|---|---|---|---|---|
| CapeNext | Swinv2-Tiny | 92.44 | 87.31 | 85.44 | 86.47 | 90.17 | 88.37 |
| | | weight / config | weight / config | weight / config | weight / config | weight / config | |
To evaluate the pretrained model, run:
```
python test.py [path_to_config_file] [path_to_pretrained_ckpt]
```
For example:
```
# capenext setting
python test.py configs/clip/clip_split1_config.py \
    work_dirs/tiny/clip/capenext/split1/split1_epoch_200.pth \
    --cfg-options additional_module_cfg.module_name="SimpleMultiModalModule"
```
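Reproducing the full table means evaluating every split with its own config and checkpoint. A small driver script in the spirit below can run all five in sequence; the config and checkpoint paths follow the naming in the examples above but are otherwise assumptions:

```python
# Evaluate all five MP-100 splits in sequence.
# Config/checkpoint paths follow the naming used in the examples above and are
# assumptions; adjust them to wherever your files actually live.
import subprocess

for split in range(1, 6):
    cfg = f"configs/clip/clip_split{split}_config.py"
    ckpt = f"work_dirs/tiny/clip/capenext/split{split}/split{split}_epoch_200.pth"
    subprocess.run(
        ["python", "test.py", cfg, ckpt,
         "--cfg-options", "additional_module_cfg.module_name=SimpleMultiModalModule"],
        check=True,
    )
```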
Our code is based on code from: