CLSA is a self-supervised learning method that focuses on learning patterns from strong augmentations.
Copyright (C) 2020 Xiao Wang, Guo-Jun Qi
License: MIT for academic use.
Contact: Guo-Jun Qi (
Representation learning has been greatly improved with the advance of contrastive learning methods. These methods have benefited from various data augmentations that are carefully designed to preserve image identity, so that images transformed from the same instance can still be retrieved. However, such carefully designed transformations limit further exploration of the novel patterns exposed by other transformations. To fill this gap, we propose a general framework called Contrastive Learning with Stronger Augmentations (CLSA) to complement current contrastive learning approaches. As found in our experiments, the distortions induced by stronger augmentations mean that the transformed images can no longer be viewed as the same instance. Thus, we propose to minimize the distribution divergence between the weakly and strongly augmented images over the representation bank to supervise the retrieval of strongly augmented queries from a pool of candidates. Experiments on the ImageNet dataset and downstream datasets show that the information from the strongly augmented images can greatly boost performance. For example, CLSA achieves a top-1 accuracy of 76.2% on ImageNet with a standard ResNet-50 architecture and a fine-tuned single-layer classifier, which is almost the same level as the 76.5% of the supervised result.
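To make the idea of minimizing the distribution divergence over the representation bank more concrete, here is a minimal, dependency-free sketch. It is not the code in this repository: the function names (`softmax`, `dot`, `ddm_loss`) are hypothetical, embeddings are assumed to be L2-normalized lists of floats, and the default temperature of 0.2 is only assumed to correspond to `--clsa_t`. The weak query's similarity distribution over the candidates serves as the target for the strong query's distribution.

```python
import math


def softmax(scores, temperature):
    # Numerically stable softmax over temperature-scaled similarity scores.
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    # Dot product; equals cosine similarity when both vectors are L2-normalized.
    return sum(x * y for x, y in zip(a, b))


def ddm_loss(weak_query, strong_query, positive_key, bank, temperature=0.2):
    # Candidates: the positive key followed by the keys in the representation bank.
    candidates = [positive_key] + list(bank)
    p_weak = softmax([dot(weak_query, k) for k in candidates], temperature)
    p_strong = softmax([dot(strong_query, k) for k in candidates], temperature)
    # Cross-entropy of the strong distribution against the weak (target) one.
    return -sum(pw * math.log(ps) for pw, ps in zip(p_weak, p_strong))
```

The loss is smallest when the strongly augmented query ranks the candidates the same way the weakly augmented query does, which is the supervision signal described above.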
CUDA version should be 10.1 or higher.
1. Install git
git clone && cd CLSA
You have two options for installing the dependencies on your computer:
3.1.1 Install pip
pip install -r requirements.txt --user
If you encounter any errors, you can install each library one by one:
pip install torch==1.7.1
pip install torchvision==0.8.2
pip install numpy==1.19.5
pip install Pillow==5.1.0
pip install tensorboard==1.14.0
pip install tensorboardX==1.7
3.2.1 Install conda
conda create -n CLSA python=3.6.9
conda activate CLSA
pip install -r requirements.txt
Each time you want to run the code, simply activate the environment with
conda activate CLSA
conda deactivate (if you want to exit)
4.1 Download the ImageNet2012 Dataset under "./datasets/imagenet2012".
4.3 Move the validation images to labeled subfolders, using the following shell script
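As an illustration of what moving validation images into labeled subfolders involves, here is a small Python sketch. It is an alternative written for this explanation, not the shell script referred to above. The function name `sort_val_images` and the mapping file format (one `<filename> <label>` pair per line) are assumptions.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, mapping_file):
    """Move each validation image into a subfolder named after its label.

    mapping_file is a hypothetical text file with one
    "<filename> <label>" pair per line.
    Returns the number of files moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for line in Path(mapping_file).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        filename, label = line.split()
        src = val_dir / filename
        if not src.is_file():
            continue  # already moved, or not present
        dst_dir = val_dir / label
        dst_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dst_dir / filename))
        moved += 1
    return moved
```

Running it a second time moves nothing, so it is safe to re-run after an interrupted pass.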
This implementation only supports multi-gpu, DistributedDataParallel training, which is faster and simpler; single-gpu or DataParallel training is not supported.
python3 --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 --size_crops 224 96 --min_scale_crops 0.2 0.086 --max_scale_crops 1.0 0.429 --pick_strong 1 --pick_weak 0 --clsa_t 0.2 --sym 0
Here [data_path] should be the root directory of the ImageNet dataset.
python3 --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 --size_crops 224 96 --min_scale_crops 0.2 0.086 --max_scale_crops 1.0 0.429 --pick_strong 1 --pick_weak 0 --clsa_t 0.2 --sym 1
Here [data_path] should be the root directory of the ImageNet dataset.
python3 --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 1 1 1 --size_crops 224 192 160 128 96 --min_scale_crops 0.2 0.172 0.143 0.114 0.086 --max_scale_crops 1.0 0.86 0.715 0.571 0.429 --pick_strong 0 1 2 3 4 --pick_weak 0 1 2 3 4 --clsa_t 0.2 --sym 0
Here [data_path] should be the root directory of the ImageNet dataset.
python3 --data=[data_path] --workers=32 --epochs=200 --start_epoch=0 --batch_size=256 --lr=0.03 --weight_decay=1e-4 --print_freq=100 --world_size=1 --rank=0 --dist_url=tcp://localhost:10001 --moco_dim=128 --moco_k=65536 --moco_m=0.999 --moco_t=0.2 --alpha=1 --aug_times=5 --nmb_crops 1 1 1 1 1 --size_crops 224 192 160 128 96 --min_scale_crops 0.2 0.172 0.143 0.114 0.086 --max_scale_crops 1.0 0.86 0.715 0.571 0.429 --pick_strong 0 1 2 3 4 --pick_weak 0 1 2 3 4 --clsa_t 0.2 --sym 1
Here [data_path] should be the root directory of the ImageNet dataset.
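The `--min_scale_crops` and `--max_scale_crops` values in the multi-crop commands above appear to be the 224-pixel range (0.2, 1.0) scaled by crop size divided by 224. This is an observation from the numbers, not something the commands state, and the helper name `scaled_ranges` is hypothetical. The sketch below reproduces the listed values to within rounding, which may help when choosing ranges for other crop sizes.

```python
def scaled_ranges(size_crops, base_size=224, base_min=0.2, base_max=1.0):
    # Scale the (min, max) crop range of the base size in proportion to
    # each crop size. Assumption: a linear relationship, inferred from
    # the values used in the commands.
    return [
        (base_min * size / base_size, base_max * size / base_size)
        for size in size_crops
    ]


for size, (lo, hi) in zip([224, 192, 160, 128, 96],
                          scaled_ranges([224, 192, 160, 128, 96])):
    print(size, round(lo, 3), round(hi, 3))
```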
With a pre-trained model, we can easily evaluate its performance on ImageNet with:
python3 --data=./datasets/imagenet2012 --dist-url=tcp://localhost:10001 --pretrained=[pretrained_model_path]
[pretrained_model_path] should be the path to the ImageNet pre-trained model.
pre-train network | pre-train epochs | Crop | CLSA top-1 acc. (%) | Model Link |
ResNet-50 | 200 | Single | 69.4 | model |
ResNet-50 | 200 | Multi | 73.3 | model |
ResNet-50 | 800 | Single | 72.2 | model |
ResNet-50 | 800 | Multi | 76.2 | None |
We are sorry that we cannot provide the 800-epoch CLSA* model: it was trained on 32 internal GPUs and cannot be released because of company regulations. For downstream tasks, we found that the 200-epoch multi-crop model performs similarly, so we suggest using that model for downstream purposes.
1 Download Dataset under "./datasets/voc"
python3 --data=[VOC_dataset_dir] --pretrained=[pretrained_model_path]
Here [VOC_dataset_dir] is the VOC dataset path, which should be the directory that contains the "vockit" directory; [pretrained_model_path] is the ImageNet pre-trained model path.
1. Install detectron2.
# in detection folder
python3 input.pth.tar output.pkl
3. Download the VOC and COCO datasets under the "./detection/datasets" directory,
following the directory structure required by detectron2.
cd detection
python --config-file configs/pascal_voc_R_50_C4_24k_CLSA.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
cd detection
python --config-file configs/coco_R_50_C4_2x_clsa.yaml --num-gpus 8 MODEL.WEIGHTS ./output.pkl
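For readers curious about what converting `input.pth.tar` to `output.pkl` involves, the sketch below shows the kind of parameter renaming that MoCo-style conversions to detectron2 perform. It is an illustration under assumptions, not this repository's conversion script: the `module.encoder_q.` prefix and the helper names `to_detectron2_key` and `convert_keys` are assumed, and loading the checkpoint and writing the pickle file are left out.

```python
def to_detectron2_key(key, prefix="module.encoder_q."):
    # Keep only query-encoder weights (assumed prefix); drop everything else.
    if not key.startswith(prefix):
        return None
    k = key[len(prefix):]
    # Layers outside the residual stages belong to the detectron2 "stem".
    if "layer" not in k:
        k = "stem." + k
    # torchvision layer1..layer4 correspond to detectron2 res2..res5.
    for t in (1, 2, 3, 4):
        k = k.replace("layer{}".format(t), "res{}".format(t + 1))
    # Batch-norm layers become the ".norm" submodule of the matching conv.
    for t in (1, 2, 3):
        k = k.replace("bn{}".format(t), "conv{}.norm".format(t))
    k = k.replace("downsample.0", "shortcut")
    k = k.replace("downsample.1", "shortcut.norm")
    return k


def convert_keys(state_dict):
    # Build a new dict with renamed keys, skipping non-encoder entries.
    converted = {}
    for old_key, value in state_dict.items():
        new_key = to_detectron2_key(old_key)
        if new_key is not None:
            converted[new_key] = value
    return converted
```

For example, `to_detectron2_key("module.encoder_q.layer1.0.bn2.weight")` gives `"res2.0.conv2.norm.weight"`, while keys from any other module are skipped.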
Contrastive Learning with Stronger Augmentations
title={Contrastive learning with stronger augmentations},
author={Wang, Xiao and Qi, Guo-Jun},
journal={arXiv preprint arXiv:2104.07713},