[CVPR-2024] AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation
This repo is the official implementation of AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation, which was accepted at CVPR 2024.
The AllSpark is a powerful Cybertronian artifact in the Transformers film series. It was used to revive Optimus Prime in Transformers: Revenge of the Fallen, which aligns well with our core idea.
In this work, we discovered that simply converting existing semi-supervised segmentation methods into a pure-transformer framework is ineffective.
- The first reason is that transformers inherently possess weaker inductive bias compared to CNNs, so transformers heavily rely on a large volume of training data to perform well.
- The more critical issue lies in the existing semi-supervised segmentation frameworks. These frameworks separate the training flows for labeled and unlabeled data, which aggravates the overfitting issue of transformers on the limited labeled data.
Thus, we propose to intervene and diversify the labeled data flow with unlabeled data in the feature domain, leading to improvements in generalizability.
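One way to picture "diversifying the labeled data flow with unlabeled data in the feature domain" is cross-attention between the two branches. The toy sketch below uses plain Python and assumes that labeled features act as queries while unlabeled features act as keys and values, so each labeled token is rebuilt as a weighted mix of unlabeled tokens. The function names, the token layout, and the attention form are illustrative assumptions, not the code in this repo.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(labeled, unlabeled):
    """Rebuild each labeled token as a weighted mix of unlabeled tokens.

    labeled:   list of query vectors (tokens from the labeled branch)
    unlabeled: list of key/value vectors (tokens from the unlabeled branch)
    Hypothetical helper for illustration only.
    """
    dim = len(labeled[0])
    scale = 1.0 / math.sqrt(dim)
    rebuilt = []
    for q in labeled:
        # Scaled dot-product similarity between one query and every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in unlabeled]
        weights = softmax(scores)
        # Weighted sum of the unlabeled tokens (used here as values).
        rebuilt.append(
            [sum(w * v[d] for w, v in zip(weights, unlabeled)) for d in range(dim)]
        )
    return rebuilt

# Made-up toy features: 2 labeled tokens, 3 unlabeled tokens, 2 channels each.
labeled_feats = [[1.0, 0.0], [0.0, 1.0]]
unlabeled_feats = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
reborn = cross_attention(labeled_feats, unlabeled_feats)
```

Because the output is a convex combination of unlabeled tokens, the labeled branch no longer sees only its own features, which is the intuition behind the generalizability claim above.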
First, clone this repo:
git clone https://github.com/xmed-lab/AllSpark.git
cd AllSpark/
Then, create a new environment and install the requirements:
conda create -n allspark python=3.7
conda activate allspark
pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
pip install tensorboard
pip install six
pip install pyyaml
pip install -U openmim
mim install mmcv==1.6.2
pip install einops
pip install timm
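After installing, a quick check that every package can be found may save a failed training launch later. The sketch below relies only on the standard library, so it also runs on Python 3.7. The list of import names is an assumption derived from the pip commands above (PyYAML is imported as yaml).

```python
import importlib.util

# Import names for the packages installed above. Note that PyYAML is
# imported as "yaml"; the other names match their pip package names.
REQUIRED_MODULES = ["torch", "torchvision", "torchaudio", "tensorboard",
                    "six", "yaml", "mmcv", "einops", "timm"]

def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All packages found." if not missing else f"Missing: {missing}")
```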
Download the Pascal VOC 2012 dataset with wget:
wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/EcgD_nffqThPvSVXQz6-8T0B3K9BeUiJLkY_J-NvGscBVA\?e\=2b0MdI\&download\=1 -O pascal.zip
unzip pascal.zip
Download the Cityscapes dataset with wget:
wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/EWoa_9YSu6RHlDpRw_eZiPUBjcY0ZU6ZpRCEG0Xp03WFxg\?e\=LtHLyB\&download\=1 -O cityscapes.zip
unzip cityscapes.zip
Download the COCO dataset with wget:
wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/EXCErskA_WFLgGTqOMgHcAABiwH_ncy7IBg7jMYn963BpA\?e\=SQTCWg\&download\=1 -O coco.zip
unzip coco.zip
Then your file structure will be like:
├── VOC2012
    ├── JPEGImages
    └── SegmentationClass
├── cityscapes
    ├── leftImg8bit
    └── gtFine
├── coco
    ├── train2017
    ├── val2017
    └── masks
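The layout above can be checked with a few lines of Python before training. This is a minimal sketch: the directory names are copied from the listing, while the root folder (the current directory here) is an assumption and should point to wherever the archives were unzipped.

```python
from pathlib import Path

# Expected sub-directories, copied from the layout listed above.
EXPECTED_LAYOUT = {
    "VOC2012": ["JPEGImages", "SegmentationClass"],
    "cityscapes": ["leftImg8bit", "gtFine"],
    "coco": ["train2017", "val2017", "masks"],
}

def missing_dirs(root, layout=EXPECTED_LAYOUT):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [
        str(Path(dataset) / sub)
        for dataset, subs in layout.items()
        for sub in subs
        if not (root / dataset / sub).is_dir()
    ]

if __name__ == "__main__":
    # "." is an assumption: run this from the folder holding the unzipped data.
    for path in missing_dirs("."):
        print("missing:", path)
```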
Next, download the following pretrained weights.
├── ./pretrained_weights
    ├── mit_b2.pth
    ├── mit_b3.pth
    ├── mit_b4.pth
    └── mit_b5.pth
For example, mit-B5:
mkdir pretrained_weights
wget https://hkustconnect-my.sharepoint.com/:u:/g/personal/hwanggr_connect_ust_hk/ET0iubvDmcBGnE43-nPQopMBw9oVLsrynjISyFeGwqXQpw\?e\=9wXgso\&download\=1 -O ./pretrained_weights/mit_b5.pth
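Share links sometimes return a web page instead of the file, for example when a link has expired, and wget will still save it under the requested name. The heuristic below, which is an assumption and not part of this repo, flags a download whose first bytes look like HTML. It says nothing about whether a non-HTML file is a valid checkpoint.

```python
import os

def looks_like_html(path, probe_bytes=512):
    """Heuristic: True if the file starts like an HTML page, not binary data."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype") or head.startswith(b"<html")

if __name__ == "__main__":
    weights = "./pretrained_weights/mit_b5.pth"  # path from the command above
    if os.path.exists(weights) and looks_like_html(weights):
        print("Download looks like a web page; please re-download:", weights)
```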
# use torch.distributed.launch
sh scripts/train.sh <num_gpu> <port>
# to fully reproduce our results, the <num_gpu> should be set as 4 on all three datasets
# otherwise, you need to adjust the learning rate accordingly
# or use slurm
# sh scripts/slurm_train.sh <num_gpu> <port> <partition>
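If fewer or more than 4 GPUs are used, one common heuristic is to scale the learning rate linearly with the total batch size. The sketch below assumes the per-GPU batch size stays fixed and that the linear scaling rule applies here; neither is confirmed by this README, and the base learning rate shown is a placeholder rather than the value used in this repo's configs.

```python
def scaled_lr(base_lr, num_gpu, reference_gpus=4):
    """Linearly rescale a learning rate tuned for reference_gpus GPUs.

    Assumes a fixed per-GPU batch size, so the total batch size (and hence
    the learning rate under the linear scaling rule) grows with num_gpu.
    """
    if num_gpu < 1:
        raise ValueError("num_gpu must be at least 1")
    return base_lr * num_gpu / reference_gpus

# Placeholder value for illustration; read the real one from the config file.
example_base_lr = 0.004
print(scaled_lr(example_base_lr, num_gpu=2))
```

Treat the result as a starting point and validate it on a small split, since the reported numbers were obtained with 4 GPUs.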
To train on other datasets or splits, please modify dataset and split in train.sh.
Model weights and training logs will be released soon.
Splits | 1/16 | 1/8 | 1/4 | 1/2 | Full |
---|---|---|---|---|---|
Weights of AllSpark | 76.07 | 78.41 | 79.77 | 80.75 | 82.12 |
Reproduced | 76.06 (log) | 78.41 | 79.93 (log) | 80.70 (log) | 82.56 (log) |
Splits | 1/16 | 1/8 | 1/4 | 1/2 |
---|---|---|---|---|
Weights of AllSpark | 78.32 | 79.98 | 80.42 | 81.14 |
Splits | 1/16 | 1/8 | 1/4 | 1/2 |
---|---|---|---|---|
Weights of AllSpark | 78.33 | 79.24 | 80.56 | 81.39 |
Splits | 1/512 | 1/256 | 1/128 | 1/64 |
---|---|---|---|---|
Weights of AllSpark | 34.10 (log) | 41.65 (log) | 45.48 (log) | 49.56 (log) |
If you find this project useful, please consider citing:
@inproceedings{allspark,
title={AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation},
author={Wang, Haonan and Zhang, Qixiang and Li, Yi and Li, Xiaomeng},
booktitle={CVPR},
year={2024}
}
AllSpark is built upon UniMatch and SegFormer. We thank their authors for making the source code publicly available.