
Variance-Reduced Stochastic Optimization (SVRG vs SGD)

This repository implements and compares Stochastic Variance Reduced Gradient (SVRG) and Stochastic Gradient Descent (SGD) on convex optimization problems, with a focus on theoretical properties, convergence behavior, and empirical performance.

The project emphasizes clarity, correctness, and alignment with classical optimization theory.


Overview

Stochastic Gradient Descent (SGD) is a foundational optimization method in machine learning due to its low per-iteration cost. However, the variance of its stochastic gradients prevents fast convergence near the optimum, requiring diminishing step sizes and resulting in sub-linear convergence rates.
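
As a point of reference, a minimal SGD loop on a finite-sum objective might look like the sketch below. The function names and the 1/t step-size schedule are illustrative only and are not tied to the implementation in src/optimizers.py.

    import numpy as np

    def sgd(grad_i, w0, n, lr0=0.1, n_iters=10_000, rng=None):
        """Plain SGD with a diminishing 1/t step size.

        grad_i(w, i) returns the stochastic gradient of the i-th component f_i at w.
        """
        rng = np.random.default_rng(rng)
        w = w0.copy()
        for t in range(1, n_iters + 1):
            i = rng.integers(n)                  # sample one component uniformly
            w -= (lr0 / t) * grad_i(w, i)        # noisy step with a diminishing step size
        return w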

Stochastic Variance Reduced Gradient (SVRG; Johnson & Zhang, 2013) addresses this limitation by constructing a variance-reduced gradient estimator using periodic full-gradient computations. This enables the use of constant step sizes and leads to significantly faster convergence on smooth and strongly convex objectives.
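
The core of SVRG can be sketched in a few lines: each outer epoch computes a full gradient at a snapshot point, and each inner step uses the estimator grad_i(w) - grad_i(w_snapshot) + full_grad(w_snapshot) with a constant step size. The sketch below is illustrative and independent of the implementation in src/optimizers.py.

    import numpy as np

    def svrg(grad_i, full_grad, w0, n, lr=0.1, n_epochs=20, inner_steps=None, rng=None):
        """SVRG with periodic full-gradient snapshots and a constant step size.

        grad_i(w, i) returns the gradient of the i-th component f_i at w;
        full_grad(w) returns the average gradient (1/n) * sum_i grad_i(w, i).
        """
        rng = np.random.default_rng(rng)
        inner_steps = inner_steps or 2 * n        # a common choice: m on the order of n
        w = w0.copy()
        for _ in range(n_epochs):
            w_snap = w.copy()                     # snapshot point for this epoch
            mu = full_grad(w_snap)                # full gradient at the snapshot
            for _ in range(inner_steps):
                i = rng.integers(n)
                # Variance-reduced estimator: unbiased, and its variance shrinks
                # as both w and w_snap approach the optimum.
                g = grad_i(w, i) - grad_i(w_snap, i) + mu
                w -= lr * g                       # constant step size
            # Johnson & Zhang analyze a randomly chosen inner iterate; keeping the
            # last iterate is a common practical simplification.
        return w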

This repository provides:

  • A clean, from-scratch implementation of SGD and SVRG
  • A controlled empirical comparison of their convergence behavior
  • A practical demonstration of variance reduction in stochastic optimization

Algorithms Implemented

Stochastic Gradient Descent (SGD)

  • Uses stochastic gradients computed from individual samples or mini-batches
  • Requires diminishing learning rates for convergence
  • Exhibits sub-linear convergence on strongly convex objectives

Stochastic Variance Reduced Gradient (SVRG)

  • Employs periodic full-gradient snapshots
  • Uses variance-reduced stochastic gradient estimators (the effect on gradient variance is illustrated in the sketch after this list)
  • Achieves linear (geometric) convergence under standard smoothness and convexity assumptions
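
As a quick, self-contained illustration of the variance-reduction effect (the toy least-squares problem and all names below are purely illustrative and unrelated to the code in src/), one can compare the empirical variance of the plain stochastic gradient with that of the SVRG estimator at the same iterate:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 5
    A = rng.normal(size=(n, d))
    b = rng.normal(size=n)

    def grad_i(w, i):                      # gradient of the i-th squared residual
        return (A[i] @ w - b[i]) * A[i]

    def full_grad(w):
        return A.T @ (A @ w - b) / n

    w = rng.normal(size=d)                 # current iterate
    w_snap = w + 0.1 * rng.normal(size=d)  # snapshot from a recent epoch
    mu = full_grad(w_snap)

    sgd_grads = np.array([grad_i(w, i) for i in range(n)])
    svrg_grads = np.array([grad_i(w, i) - grad_i(w_snap, i) + mu for i in range(n)])

    # Both estimators are unbiased; the SVRG estimator concentrates around the
    # full gradient when the snapshot is close to the current iterate.
    print("SGD estimator variance: ", sgd_grads.var(axis=0).sum())
    print("SVRG estimator variance:", svrg_grads.var(axis=0).sum())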

Experimental Setup

The algorithms are evaluated on:

  • Logistic Regression
  • Ridge Regression

Experiments are conducted on both synthetic and real-world datasets to highlight differences in convergence speed, stability, and final optimization error.
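
For concreteness, both objectives can be written as regularized finite sums. The sketch below shows one common form of the per-sample losses and gradients; the exact definitions and regularization conventions used in src/models.py may differ.

    import numpy as np

    def logistic_loss_i(w, x, y, lam=1e-3):
        """L2-regularized logistic loss for one sample, with labels y in {-1, +1}."""
        return np.log1p(np.exp(-y * (x @ w))) + 0.5 * lam * (w @ w)

    def logistic_grad_i(w, x, y, lam=1e-3):
        s = 1.0 / (1.0 + np.exp(-y * (x @ w)))   # sigmoid of the margin y * x.w
        return -(1.0 - s) * y * x + lam * w

    def ridge_loss_i(w, x, y, lam=1e-3):
        """L2-regularized squared loss for one sample (ridge regression)."""
        return 0.5 * (x @ w - y) ** 2 + 0.5 * lam * (w @ w)

    def ridge_grad_i(w, x, y, lam=1e-3):
        return (x @ w - y) * x + lam * w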


Project Structure

  • src/models.py — Loss functions and gradient computations
  • src/optimizers.py — Implementations of SGD and SVRG
  • src/utils.py — Data loading and preprocessing utilities
  • notebooks/comparison_demo.ipynb — Empirical evaluation and visualization

Usage

  1. Install dependencies:
    pip install -r requirements.txt
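
  2. Launch the demo notebook (this assumes Jupyter is available in your environment, e.g. installed via requirements.txt; the path matches the project structure above):
    jupyter notebook notebooks/comparison_demo.ipynb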

References

  • Johnson, R., & Zhang, T. (2013). Accelerating Stochastic Gradient Descent using Predictive Variance Reduction. Advances in Neural Information Processing Systems.

About

A clean, theoretical implementation of Stochastic Variance Reduced Gradient (SVRG) demonstrating linear convergence rates on convex objectives. Includes comparative analysis against vanilla SGD with empirical validation on synthetic and real-world datasets.
