In the following, we explain the function of each part of the training scripts, i.e., those for LUSS50, LUSS300, and LUSS919.
We conduct pretraining with our proposed non-contrastive pixel-to-pixel representation alignment and deep-to-shallow supervision.
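All commands below rely on shell variables defined at the top of each training script. The following block is an illustrative sketch only; the values are assumptions (roughly following an ImageNetS50-style configuration) and should be adapted to your paths and hardware.
# Illustrative variable settings (assumed values, not prescriptive).
CUDA='0,1,2,3'                                   # GPUs visible to the job
N_GPU=4                                          # processes for torch.distributed.launch
ARCH=resnet18                                    # backbone architecture
DATA=/data/ImageNetS/ImageNetS50                 # dataset root for pretraining/finetuning
IMAGENETS=/data/ImageNetS/ImageNetS50            # dataset root for clustering/inference
DUMP_PATH=./weights/pass50                       # step-1 checkpoints
DUMP_PATH_FINETUNE=${DUMP_PATH}/pixel_attention  # step-2 outputs
DUMP_PATH_SEG=${DUMP_PATH}/pixel_finetuning      # step-3 outputs
HIDDEN_DIM=512
NUM_PROTOTYPE=500
NUM_CLASSES=50
BATCH=256
EPOCH=200
EPOCH_PIXELATT=20
EPOCH_SEG=20
QUEUE_LENGTH=2048
QUEUE_LENGTH_PIXELATT=3840
FREEZE_PROTOTYPES=1001
FREEZE_PROTOTYPES_PIXELATT=0
DIST_URL='tcp://localhost:10001'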
CUDA_VISIBLE_DEVICES=${CUDA} python -m torch.distributed.launch --nproc_per_node=${N_GPU} main_pretrain.py \
--arch ${ARCH} \
--data_path ${DATA}/train \
--dump_path ${DUMP_PATH} \
--nmb_crops 2 \
--size_crops 224 \
--min_scale_crops 0.08 \
--max_scale_crops 1.0 \
--crops_for_assign 0 1 \
--temperature 0.1 \
--epsilon 0.05 \
--sinkhorn_iterations 3 \
--feat_dim 128 \
--hidden_mlp ${HIDDEN_DIM} \
--nmb_prototypes ${NUM_PROTOTYPE} \
--queue_length ${QUEUE_LENGTH} \
--epoch_queue_starts 15 \
--epochs ${EPOCH} \
--batch_size ${BATCH} \
--base_lr 0.6 \
--final_lr 0.0006 \
--freeze_prototypes_niters ${FREEZE_PROTOTYPES} \
--wd 0.000001 \
--warmup_epochs 0 \
--use_fp16 true \
--sync_bn pytorch \
--workers 10 \
--dist_url ${DIST_URL} \
--seed 10010 \
--shallow 3 \
--weights 1 1
In this part, you should set --pretrained to the pretrained weights obtained in step 1.
CUDA_VISIBLE_DEVICES=${CUDA} python -m torch.distributed.launch --nproc_per_node=${N_GPU} main_pixel_attention.py \
--arch ${ARCH} \
--data_path ${IMAGENETS}/train \
--dump_path ${DUMP_PATH_FINETUNE} \
--nmb_crops 2 \
--size_crops 224 \
--min_scale_crops 0.08 \
--max_scale_crops 1.0 \
--crops_for_assign 0 1 \
--temperature 0.1 \
--epsilon 0.05 \
--sinkhorn_iterations 3 \
--feat_dim 128 \
--hidden_mlp ${HIDDEN_DIM} \
--nmb_prototypes ${NUM_PROTOTYPE} \
--queue_length ${QUEUE_LENGTH_PIXELATT} \
--epoch_queue_starts 0 \
--epochs ${EPOCH_PIXELATT} \
--batch_size ${BATCH} \
--base_lr 0.6 \
--final_lr 0.0006 \
--freeze_prototypes_niters ${FREEZE_PROTOTYPES_PIXELATT} \
--wd 0.000001 \
--warmup_epochs 0 \
--use_fp16 true \
--sync_bn pytorch \
--workers 10 \
--dist_url ${DIST_URL} \
--seed 10010 \
--pretrained ${DUMP_PATH}/checkpoint.pth.tar
Please set --pretrained to the pretrained weights obtained in step 2.1.
In this part, the center of each cluster is generated and saved to ${DUMP_PATH_FINETUNE}/cluster/centroids.npy.
CUDA_VISIBLE_DEVICES=${CUDA} python cluster.py -a ${ARCH} \
--pretrained ${DUMP_PATH_FINETUNE}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_FINETUNE} \
-c ${NUM_CLASSES}
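As a quick sanity check, you can inspect the saved centers. The snippet below is a sketch; we assume the file holds one row per cluster center (the exact shape depends on the feature dimension).
# Sanity check (illustrative): print the shape and dtype of the saved cluster centers.
python -c "import numpy as np; c = np.load('${DUMP_PATH_FINETUNE}/cluster/centroids.npy'); print(c.shape, c.dtype)"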
The --centroid argument points to the .npy file that stores the clustering centers, and --pretrained should be set to the pretrained weights obtained in step 2.1.
In this step, the validation mIoUs under different thresholds are reported.
CUDA_VISIBLE_DEVICES=${CUDA} python inference_pixel_attention.py -a ${ARCH} \
--pretrained ${DUMP_PATH_FINETUNE}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_FINETUNE} \
-c ${NUM_CLASSES} \
--mode validation \
--dist_url ${DIST_URL} \
--test \
--centroid ${DUMP_PATH_FINETUNE}/cluster/centroids.npy
CUDA_VISIBLE_DEVICES=${CUDA} python evaluator.py \
--predict_path ${DUMP_PATH_FINETUNE} \
--data_path ${IMAGENETS} \
-c ${NUM_CLASSES} \
--mode validation \
--curve \
--min 0 \
--max 80
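The evaluator sweeps integer threshold indices from --min to --max. Based on the -t 0.37 value used in the next step, we assume an index T corresponds to a threshold of T/100; the conversion below is a sketch under that assumption.
# Illustrative conversion (assumption: sweep index T maps to threshold T/100).
BEST_INDEX=37                                # hypothetical best index from the sweep
T=$(python -c "print(${BEST_INDEX} / 100)")
echo "use -t ${T}"                           # prints: use -t 0.37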
Please set -t to the best threshold obtained in step 2.3; 0.37 is used as an example below. This step generates pseudo-labels on the training set.
CUDA_VISIBLE_DEVICES=${CUDA} python inference_pixel_attention.py -a ${ARCH} \
--pretrained ${DUMP_PATH_FINETUNE}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_FINETUNE} \
-c ${NUM_CLASSES} \
--mode train \
--centroid ${DUMP_PATH_FINETUNE}/cluster/centroids.npy \
--dist_url ${DIST_URL} \
-t 0.37
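Before moving on, you can verify that pseudo-labels were written. This check is a sketch and assumes the masks are saved as PNG files under ${DUMP_PATH_FINETUNE}/train.
# Illustrative check (assumes pseudo-label masks are stored as PNG files).
find ${DUMP_PATH_FINETUNE}/train -name '*.png' | wc -l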
Please set --pseudo_path to the path where the pseudo-labels generated in step 2.4 are saved.
CUDA_VISIBLE_DEVICES=${CUDA} python -m torch.distributed.launch --nproc_per_node=${N_GPU} main_pixel_finetuning.py \
--arch ${ARCH} \
--data_path ${DATA}/train \
--dump_path ${DUMP_PATH_SEG} \
--epochs ${EPOCH_SEG} \
--batch_size ${BATCH} \
--base_lr 0.6 \
--final_lr 0.0006 \
--wd 0.000001 \
--warmup_epochs 0 \
--use_fp16 true \
--sync_bn pytorch \
--workers 8 \
--dist_url ${DIST_URL} \
--num_classes ${NUM_CLASSES} \
--pseudo_path ${DUMP_PATH_FINETUNE}/train \
--pretrained ${DUMP_PATH}/checkpoint.pth.tar
To evaluate performance on the test set, set --mode to test and submit the generated zip file to our online server (a test-mode sketch is shown after the evaluation command below).
CUDA_VISIBLE_DEVICES=${CUDA} python inference.py -a ${ARCH} \
--pretrained ${DUMP_PATH_SEG}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_SEG} \
-c ${NUM_CLASSES} \
--dist_url ${DIST_URL} \
--mode validation \
--match_file ${DUMP_PATH_SEG}/validation/match.json
CUDA_VISIBLE_DEVICES=${CUDA} python evaluator.py \
--predict_path ${DUMP_PATH_SEG} \
--data_path ${IMAGENETS} \
-c ${NUM_CLASSES} \
--mode validation
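For the test set, the inference call changes only in --mode; the sketch below assumes the remaining flags, including the match file produced on validation, carry over unchanged.
# Test-mode sketch (assumed to mirror the validation run above).
CUDA_VISIBLE_DEVICES=${CUDA} python inference.py -a ${ARCH} \
--pretrained ${DUMP_PATH_SEG}/checkpoint.pth.tar \
--data_path ${IMAGENETS} \
--dump_path ${DUMP_PATH_SEG} \
-c ${NUM_CLASSES} \
--dist_url ${DIST_URL} \
--mode test \
--match_file ${DUMP_PATH_SEG}/validation/match.json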
bash distance_matching/distance_matching.sh [arch, e.g., resnet50] \
[path to pretrained model] \
[path to save segmentation mask] \
[path to datasets, e.g., /data/ImageNetS/ImageNetS{50/300/919}] \
[number of classes, e.g., 50, 300, or 919] \
[validation | test]
On the test set, the code generates a zip file that can be submitted to the online server.
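For example (argument values are illustrative and reuse the variables defined above):
# Illustrative invocation of the distance-matching script on the validation split.
bash distance_matching/distance_matching.sh ${ARCH} \
${DUMP_PATH_SEG}/checkpoint.pth.tar \
${DUMP_PATH_SEG} \
${IMAGENETS} \
${NUM_CLASSES} \
validation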