In the following, we explain the function of each part of the training scripts, i.e., those for LUSS50, LUSS300, and LUSS919.
We conduct pretraining with our proposed non-contrastive pixel-to-pixel representation alignment and deep-to-shallow supervision.
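All commands below rely on shell variables defined at the top of each training script. The following block is an illustrative sketch only; the values are assumptions (roughly following an ImageNetS50-style configuration) and should be adapted to your paths and hardware.
# Illustrative variable settings (assumed values, not prescriptive).
CUDA='0,1,2,3'                                   # GPUs visible to the job
N_GPU=4                                          # processes for torch.distributed.launch
ARCH=resnet18                                    # backbone architecture
DATA=/data/ImageNetS/ImageNetS50                 # dataset root for pretraining/finetuning
IMAGENETS=/data/ImageNetS/ImageNetS50            # dataset root for clustering/inference
DUMP_PATH=./weights/pass50                       # step-1 checkpoints
DUMP_PATH_FINETUNE=${DUMP_PATH}/pixel_attention  # step-2 outputs
DUMP_PATH_SEG=${DUMP_PATH}/pixel_finetuning      # step-3 outputs
HIDDEN_DIM=512
NUM_PROTOTYPE=500
NUM_CLASSES=50
BATCH=256
EPOCH=200
EPOCH_PIXELATT=20
EPOCH_SEG=20
QUEUE_LENGTH=2048
QUEUE_LENGTH_PIXELATT=3840
FREEZE_PROTOTYPES=1001
FREEZE_PROTOTYPES_PIXELATT=0
DIST_URL='tcp://localhost:10001'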
CUDA_VISIBLE_DEVICES=${CUDA} python -m torch.distributed.launch --nproc_per_node=${N_GPU} main_pretrain.py \
--arch ${ARCH} \
--data_path ${DATA}/train \
--dump_path ${DUMP_PATH} \
--nmb_crops 2 \
--size_crops 224 \
--min_scale_crops 0.08 \
--max_scale_crops 1.0 \
--crops_for_assign 0 1 \
--temperature 0.1 \
--epsilon 0.05 \
--sinkhorn_iterations 3 \
--feat_dim 128 \
--hidden_mlp ${HIDDEN_DIM} \
--nmb_prototypes ${NUM_PROTOTYPE} \
--queue_length ${QUEUE_LENGTH} \
--epoch_queue_starts 15 \
--epochs ${EPOCH} \
--batch_size ${BATCH} \
--base_lr 0.6 \
--final_lr 0.0006 \
--freeze_prototypes_niters ${FREEZE_PROTOTYPES} \
--wd 0.000001 \
--warmup_epochs 0 \
--use_fp16 true \
--sync_bn pytorch \
--workers 10 \
--dist_url ${DIST_URL} \
--seed 10010 \
--shallow 3 \
--weights 1 1
In this part, you should set --pretrained to the pretrained weights obtained in step 1.
CUDA_VISIBLE_DEVICES=${CUDA} python -m torch.distributed.launch --nproc_per_node=${N_GPU} main_pixel_attention.py \
--arch ${ARCH} \
--data_path ${IMAGENETS}/train \
--dump_path ${DUMP_PATH_FINETUNE} \
--nmb_crops 2 \
--size_crops 224 \
--min_scale_crops 0.08 \
--max_scale_crops 1.0 \
--crops_for_assign 0 1 \
--temperature 0.1 \
--epsilon 0.05 \
--sinkhorn_iterations 3 \
--feat_dim 128 \
--hidden_mlp ${HIDDEN_DIM} \
--nmb_prototypes ${NUM_PROTOTYPE} \
--queue_length ${QUEUE_LENGTH_PIXELATT} \
--epoch_queue_starts 0 \
--epochs ${EPOCH_PIXELATT} \
--batch_size ${BATCH} \
--base_lr 0.6 \
--final_lr 0.0006 \
--freeze_prototypes_niters ${FREEZE_PROTOTYPES_PIXELATT} \
--wd 0.000001 \
--warmup_epochs 0 \
--use_fp16 true \
--sync_bn pytorch \
--workers 10 \
--dist_url ${DIST_URL} \
--seed 10010 \
--pretrained ${DUMP_PATH}/checkpoint.pth.tar
Please set --pretrained to the pretrained weights obtained in step 2.1.
In this part, the center of each cluster is generated and saved to ${DUMP_PATH_FINETUNE}/cluster/centroids.npy.
CUDA_VISIBLE_DEVICES=${CUDA} python cluster.py -a ${ARCH} \
--pretrained ${DUMP_PATH_FINETUNE}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_FINETUNE} \
-c ${NUM_CLASSES}
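As a quick sanity check, you can inspect the saved centers. The snippet below is a sketch; we assume the file holds one row per cluster center (the exact shape depends on the feature dimension).
# Sanity check (illustrative): print the shape and dtype of the saved cluster centers.
python -c "import numpy as np; c = np.load('${DUMP_PATH_FINETUNE}/cluster/centroids.npy'); print(c.shape, c.dtype)"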
The --centroid argument points to the .npy file that stores the clustering centers, and --pretrained should be set to the pretrained weights obtained in step 2.1.
In this step, the validation mIoUs under different thresholds are reported.
CUDA_VISIBLE_DEVICES=${CUDA} python inference_pixel_attention.py -a ${ARCH} \
--pretrained ${DUMP_PATH_FINETUNE}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_FINETUNE} \
-c ${NUM_CLASSES} \
--mode validation \
--dist_url ${DIST_URL} \
--test \
--centroid ${DUMP_PATH_FINETUNE}/cluster/centroids.npy
CUDA_VISIBLE_DEVICES=${CUDA} python evaluator.py \
--predict_path ${DUMP_PATH_FINETUNE} \
--data_path ${IMAGENETS} \
-c ${NUM_CLASSES} \
--mode validation \
--curve \
--min 0 \
--max 80
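The evaluator sweeps integer threshold indices from --min to --max. Based on the -t 0.37 value used in the next step, we assume an index T corresponds to a threshold of T/100; the conversion below is a sketch under that assumption.
# Illustrative conversion (assumption: sweep index T maps to threshold T/100).
BEST_INDEX=37                                # hypothetical best index from the sweep
T=$(python -c "print(${BEST_INDEX} / 100)")
echo "use -t ${T}"                           # prints: use -t 0.37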
Please set -t to the best threshold obtained in step 2.3; 0.37 is used as an example below. This step generates pseudo-labels on the training set.
CUDA_VISIBLE_DEVICES=${CUDA} python inference_pixel_attention.py -a ${ARCH} \
--pretrained ${DUMP_PATH_FINETUNE}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_FINETUNE} \
-c ${NUM_CLASSES} \
--mode train \
--centroid ${DUMP_PATH_FINETUNE}/cluster/centroids.npy \
--dist_url ${DIST_URL} \
-t 0.37
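Before moving on, you can verify that pseudo-labels were written. This check is a sketch and assumes the masks are saved as PNG files under ${DUMP_PATH_FINETUNE}/train.
# Illustrative check (assumes pseudo-label masks are stored as PNG files).
find ${DUMP_PATH_FINETUNE}/train -name '*.png' | wc -l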
Please set --pseudo_path to the path where the pseudo-labels generated in step 2.4 are saved.
CUDA_VISIBLE_DEVICES=${CUDA} python -m torch.distributed.launch --nproc_per_node=${N_GPU} main_pixel_finetuning.py \
--arch ${ARCH} \
--data_path ${DATA}/train \
--dump_path ${DUMP_PATH_SEG} \
--epochs ${EPOCH_SEG} \
--batch_size ${BATCH} \
--base_lr 0.6 \
--final_lr 0.0006 \
--wd 0.000001 \
--warmup_epochs 0 \
--use_fp16 true \
--sync_bn pytorch \
--workers 8 \
--dist_url ${DIST_URL} \
--num_classes ${NUM_CLASSES} \
--pseudo_path ${DUMP_PATH_FINETUNE}/train \
--pretrained ${DUMP_PATH}/checkpoint.pth.tar
To evaluate performance on the test set, set --mode to test and submit the generated zip file to our online server (a test-mode sketch is shown after the evaluation command below).
CUDA_VISIBLE_DEVICES=${CUDA} python inference.py -a ${ARCH} \
--pretrained ${DUMP_PATH_SEG}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_SEG} \
-c ${NUM_CLASSES} \
--dist_url ${DIST_URL} \
--mode validation \
--match_file ${DUMP_PATH_SEG}/validation/match.json
CUDA_VISIBLE_DEVICES=${CUDA} python evaluator.py \
--predict_path ${DUMP_PATH_SEG} \
--data_path ${IMAGENETS} \
-c ${NUM_CLASSES} \
--mode validation
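For the test set, the inference call changes only in --mode; the sketch below assumes the remaining flags, including the match file produced on validation, carry over unchanged.
# Test-mode sketch (assumed to mirror the validation run above).
CUDA_VISIBLE_DEVICES=${CUDA} python inference.py -a ${ARCH} \
--pretrained ${DUMP_PATH_SEG}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_SEG} \
-c ${NUM_CLASSES} \
--dist_url ${DIST_URL} \
--mode test \
--match_file ${DUMP_PATH_SEG}/validation/match.json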
bash distance_matching/distance_matching.sh [arch, e.g., resnet50] \
[path to pretrained model] \
[path to save segmentation mask] \
[path to datasets, e.g., /data/ImageNetS/ImageNetS{50/300/919}] \
[number of classes, e.g., 50, 300, or 919] \
[validation | test]
On the test set, the code generates a zip file that can be submitted to the online server.
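For example (argument values are illustrative and reuse the variables defined above):
# Illustrative invocation of the distance-matching script on the validation split.
bash distance_matching/distance_matching.sh ${ARCH} \
${DUMP_PATH_SEG}/checkpoint.pth.tar \
${DUMP_PATH_SEG} \
${IMAGENETS} \
${NUM_CLASSES} \
validation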