Train TensorFlow Keras models with cosine annealing and save an ensemble of models with no additional computational expense.
Ensembles of machine learning models have empirically demonstrated state-of-the-art results in many regression and classification tasks. Deep neural networks are popular models given their flexibility and theoretical properties, but ensembling several independent neural networks is often impractical due to the computational expense.
Huang et al. (2017) propose the simple idea of Snapshot Ensembling, in which a single neural network is trained with a cyclic learning rate schedule such as cosine annealing (Loshchilov and Hutter, 2017). At the end of each annealing cycle the model parameters are saved, so we obtain an ensemble of trained neural networks at the cost of training a single one.
Conceptually, we may think of this as letting the neural network converge quickly under a decaying learning rate and saving the model at several local minima of the loss surface. We may then use the saved models as an ensemble for prediction or inference.
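As a sketch of the schedule itself (not the library's internal code), the learning rate within each cycle follows the shifted cosine of Loshchilov and Hutter (2017), decaying from a maximum to a minimum before restarting; the function and argument names below are illustrative only:

```python
import math

def cosine_annealed_lr(epoch_in_cycle, cycle_length, lr_max, lr_min):
    """Shifted cosine schedule: starts at lr_max and decays to lr_min
    over `cycle_length` epochs, after which a new cycle restarts."""
    progress = epoch_in_cycle / cycle_length  # fraction of the cycle completed, in [0, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# Example: a 10-epoch cycle decaying from 0.1 down towards 0.001
rates = [cosine_annealed_lr(e, 10, 0.1, 0.001) for e in range(10)]
```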
This simple library is an implementation of their ideas as a TensorFlow 2 Keras Callback to be used during training.
pip install snapshot-ensemble
# Required
python >= 3.6
numpy
tensorflow >= 2.0
# Suggested
matplotlib
from snapshot_ensemble import SnapshotEnsembleCallback
model = ...  # Compiled TensorFlow 2 Keras model
# Train the Keras model with Cosine Annealing + Snapshot Ensembling
snapshotCB = SnapshotEnsembleCallback()
model.fit(*args,
          callbacks=[snapshotCB])
# Snapshotted models are then automatically saved (default: `Ensemble/`)
# and may be loaded in for ensembled predictions or inference
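As one possible way to use the snapshots for ensembled prediction (a sketch, not an API provided by this library), the saved models can be loaded back and their predictions averaged; the file pattern below assumes the default `Ensemble/` directory and standard Keras model files, so adjust it to match what the callback actually writes:

```python
import glob
import numpy as np
import tensorflow as tf

# Load every snapshot saved by the callback (default directory: `Ensemble/`).
# The glob pattern is an assumption; adjust it to the saved file names/format.
snapshot_paths = sorted(glob.glob("Ensemble/*"))
snapshots = [tf.keras.models.load_model(path) for path in snapshot_paths]

def ensemble_predict(x):
    """Average the predictions of all snapshot models."""
    preds = np.stack([m.predict(x) for m in snapshots], axis=0)
    return preds.mean(axis=0)
```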
The learning rate schedule inside `SnapshotEnsembleCallback` takes the following parameters:

- `cycle_length`: Initial number of epochs per cycle
- `cycle_length_multiplier`: Multiplier on the number of epochs per cycle
- `lr_init`: Initial maximum learning rate
- `lr_min`: Initial minimum learning rate
- `lr_multiplier`: Multiplier on the learning rate per cycle
The `cycle_length`, `lr_init`, and `lr_min` parameters control the initial length and learning rate bounds of each cycle. The `*_multiplier` parameters allow the cycle length and/or learning rate bounds to be adjusted dynamically as training progresses. The default parameters are very likely suboptimal for your task, so these hyperparameters will need to be tuned. A helper function `VisualizeLR()` is provided to visualize the learning rate schedule.
Example learning rate schedules: (left) standard cosine annealing; (middle) dynamic cycle length; (right) dynamic cycle length and learning rate bounds.
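As an illustration of how these hyperparameters fit together, the callback might be configured with lengthening cycles and a decaying restart learning rate as below; the keyword names follow the parameter list above, while the specific values are only an example and not recommended defaults:

```python
from snapshot_ensemble import SnapshotEnsembleCallback

# 5-epoch first cycle, each subsequent cycle twice as long,
# with the restart (maximum) learning rate halved every cycle.
snapshotCB = SnapshotEnsembleCallback(
    cycle_length=5,
    cycle_length_multiplier=2,
    lr_init=0.05,
    lr_min=1e-4,
    lr_multiplier=0.5,
)
```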
For a simple example, see this notebook.
Huang, G., Li, Y., & Pleiss, G. (2017). Snapshot Ensembles: Train 1, Get M for Free. International Conference on Learning Representations. https://doi.org/10.48550/arXiv.1704.00109
Loshchilov, I., & Hutter, F. (2017). SGDR: Stochastic Gradient Descent with Warm Restarts. International Conference on Learning Representations. https://doi.org/10.48550/arXiv.1608.03983