semi-supervised-paper-implementation

This repository reproduces the methods from several semi-supervised learning papers.

Before running the code, install the required packages with the following commands.

pip3 install torch==1.1.0
pip3 install torchvision==0.3.0
pip3 install tensorflow # we use tensorboard in the project

Prepare datasets

CIFAR-10

Use the following command to unpack the data and generate labeled data path files.

python3 -m semi_supervised.core.utils.cifar10

Run on CIFAR-10

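To reproduce the results in Temporal Ensembling for Semi-Supervised Learning, run the following commands (the first for the Temporal Ensembling model, the second for the Pi model).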

CUDA_VISIBLE_DEVICES=0 python3 -m semi_supervised.experiments.tempens.cifar10_test
CUDA_VISIBLE_DEVICES=0 python3 -m semi_supervised.experiments.pi.cifar10_test

To reproduce the results in Mean teachers are better role models, run

CUDA_VISIBLE_DEVICES=0 python3 -m semi_supervised.experiments.mean_teacher.cifar10_test

Note: This code has not been tested on multiple GPUs, so the results are not guaranteed to match when more than one GPU is used.

Results on CIFAR-10

Number of Labeled Data	1000	2000	4000	All labels
Pi model (from SNTG)	68.35 ± 1.20	82.43 ± 0.44	87.64 ± 0.31	94.44 ± 0.10
Pi model (this repository)	69.615 ± 1.3013	82.92 ± 0.532	87.925 ± 0.227	---
Tempens model (from SNTG)	76.69 ± 1.01	84.36 ± 0.39	87.84 ± 0.24	94.4 ± 0.10
Tempens model (this repository)	78.517 ± 1.1653	84.757 ± 0.42445	88.166 ± 0.24324	94.72 ± 0.14758
Mean Teacher (from Mean teachers)	78.45	84.27	87.69	94.06
Mean Teacher (this repository)	80.421 ± 1.0264	85.236 ± 0.655	88.435 ± 0.311	94.482 ± 0.1086

We report the mean and standard deviation over 10 runs with different random seeds (1000 to 1009).

Training strategies in semi-supervised learning

Many semi-supervised learning papers share a common set of training strategies. This section describes the ones I am aware of.

Learning rate

$\rm{lr} = \rm{rampup\_value} * \rm{rampdown\_value} * \rm{init\_lr}$

The computation of rampup_value and rampdown_value is implemented in semi_supervised/core/utils/fun_utils.py.

The curve of the learning rate is shown in the figure below.
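
As a rough illustration of the formula above, the sketch below uses Gaussian-shaped ramps of the kind described in the Temporal Ensembling paper. The epoch counts and the function names here are assumptions chosen for the example. They are not read from fun_utils.py, which may compute the ramps differently.

```python
import math

# Assumed hyperparameters (hypothetical defaults, not taken from the repository).
NUM_EPOCHS = 300
RAMPUP_LENGTH = 80
RAMPDOWN_LENGTH = 50


def rampup_value(epoch, rampup_length=RAMPUP_LENGTH):
    """Rise smoothly from exp(-5) at epoch 0 to 1.0 at rampup_length."""
    if epoch >= rampup_length:
        return 1.0
    p = 1.0 - max(0.0, float(epoch)) / rampup_length
    return math.exp(-5.0 * p * p)


def rampdown_value(epoch, num_epochs=NUM_EPOCHS, rampdown_length=RAMPDOWN_LENGTH):
    """Stay at 1.0, then decay over the last rampdown_length epochs."""
    start = num_epochs - rampdown_length
    if epoch < start:
        return 1.0
    ep = (epoch - start) * 0.5
    return math.exp(-(ep * ep) / rampdown_length)


def learning_rate(epoch, init_lr):
    """lr = rampup_value * rampdown_value * init_lr, as in the formula above."""
    return rampup_value(epoch) * rampdown_value(epoch) * init_lr
```

With these assumed settings, the learning rate starts near zero, reaches init_lr after the ramp-up phase, stays there, and decays again over the final epochs.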

Optimizer

Many semi-supervised learning methods use the Adam optimizer with beta1 = 0.9 and beta2 = 0.999. During training, beta1 is adjusted dynamically:

$\rm{adam\_beta1} = \rm{rampdown\_value} * 0.9 + (1.0 - \rm{rampdown\_value}) * 0.5$

The curve of beta1 is shown in the figure below.
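
The interpolation above can be sketched in a few lines. The helper names are hypothetical, and the second function assumes the optimizer exposes its settings as a list of dicts with a "betas" entry, which is how torch.optim.Adam stores them in param_groups. Plain dicts are used here so the example runs without PyTorch.

```python
def adam_beta1(rampdown_value, start_beta1=0.9, end_beta1=0.5):
    """Interpolate beta1 from 0.9 (rampdown_value = 1) to 0.5 (rampdown_value = 0)."""
    return rampdown_value * start_beta1 + (1.0 - rampdown_value) * end_beta1


def set_adam_beta1(param_groups, beta1):
    """Overwrite beta1 in every parameter group, leaving beta2 unchanged."""
    for group in param_groups:
        _, beta2 = group["betas"]
        group["betas"] = (beta1, beta2)


# Stand-in for optimizer.param_groups (assumption: same layout as torch.optim.Adam).
param_groups = [{"betas": (0.9, 0.999)}]
set_adam_beta1(param_groups, adam_beta1(rampdown_value=0.5))
```

In a training loop, this update would typically be applied once per epoch or step, before calling the optimizer.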

Consistency Weight

Some methods use a dynamically adjusted weight to balance the supervised loss and the unsupervised loss.

$\rm{weight} = \rm{init\_weight} * \rm{rampup\_value}$

The curve of consistency weight is shown in the figure below.
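
A minimal sketch of how such a weight might enter the training objective is shown below. The function names are hypothetical, and applying the weight to the unsupervised term only is an assumption based on common practice, not something confirmed by this repository's code.

```python
def consistency_weight(init_weight, rampup_value):
    """weight = init_weight * rampup_value, as in the formula above."""
    return init_weight * rampup_value


def total_loss(supervised_loss, unsupervised_loss, weight):
    """Assumed combination: the weight scales only the unsupervised term."""
    return supervised_loss + weight * unsupervised_loss


# Early in training rampup_value is close to 0, so the unsupervised term is
# nearly ignored; once rampup_value reaches 1 it is weighted by init_weight.
early = total_loss(1.0, 0.5, consistency_weight(100.0, 0.0))
late = total_loss(1.0, 0.5, consistency_weight(100.0, 1.0))
```

The values 100.0, 1.0, and 0.5 are placeholders for illustration, not settings used in the experiments above.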

TODO list

Mean Teacher
Pi Model
Temporal Ensembling Model
VAT
More....

References

Mean teachers are better role models
Temporal Ensembling for Semi-Supervised Learning
Good Semi-Supervised Learning that Requires a Bad GAN
Smooth Neighbors on Teacher Graphs for Semi-supervised Learning
