CompreSSM

The Curious Case of In-Training Compression of State Space Models
Paper • ICLR 2026 • Mamba Experiments

Python 3.10+ · JAX · MIT License


This is the main repository accompanying the ICLR 2026 paper The Curious Case of In-Training Compression of State Space Models. It contains the LRU experiments. The Mamba experiments are available in the CompreSSMamba repository.


State Space Models (SSMs) offer parallelizable training and fast inference for long-sequence modeling. At their core are recurrent dynamical systems whose per-step update cost scales with the state dimension. CompreSSM applies balanced truncation, a classical control-theoretic model-reduction technique, during training to identify and remove low-influence states based on their Hankel singular values. Models that start large and shrink during training retain the computational efficiency of small models while outperforming models trained directly at the smaller state dimension.
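
To make the mechanism concrete, here is a minimal one-shot sketch of balanced truncation via the square-root method. Everything below is illustrative: the function name, the dense SciPy Lyapunov solvers, and the reading of τ as the fraction of squared-HSV energy allowed to be discarded (consistent with the Quick Start comment below) are assumptions for exposition, not the CompreSSM implementation in compressm/reduction/.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, tau):
    """Reduce a stable discrete-time SSM x' = A x + B u, y = C x.

    Keeps the smallest number of states whose Hankel singular values
    retain at least a (1 - tau) fraction of the squared-HSV energy.
    Assumes rho(A) < 1 and positive-definite Gramians.
    """
    # Gramians: P = A P A^T + B B^T (controllability),
    #           Q = A^T Q A + C^T C (observability).
    P = solve_discrete_lyapunov(A, B @ B.T)
    Q = solve_discrete_lyapunov(A.T, C.T @ C)
    # Square-root method: the HSVs are the singular values of Lq^T Lp.
    Lp = cholesky(P, lower=True)
    Lq = cholesky(Q, lower=True)
    U, hsv, Vt = svd(Lq.T @ Lp)
    # Truncation order: discard at most a tau fraction of sum(hsv^2).
    energy = np.cumsum(hsv**2) / np.sum(hsv**2)
    r = int(np.searchsorted(energy, 1.0 - tau) + 1)
    # Balancing projection restricted to the r dominant directions.
    S = np.diag(hsv[:r] ** -0.5)
    T = Lp @ Vt[:r].T @ S         # maps reduced state -> full state
    Ti = S @ U[:, :r].T @ Lq.T    # maps full state -> reduced state
    return Ti @ A @ T, Ti @ B, C @ T, hsv

Applied periodically during training, a reduction of this kind lets the state dimension shrink as the trailing Hankel singular values decay.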


Installation

git clone https://github.com/camail-official/compressm.git
cd compressm
conda env create -f environment.yaml
conda activate compressm

Data Preparation

The MNIST and CIFAR datasets auto-download into the data/ directory. The LRA datasets must be downloaded manually from the LRA GitHub page and organized as follows:

path/to/lra_release/
  pathfinder/
    pathfinder32/
    pathfinder64/
    pathfinder128/
    pathfinder256/
  aan/
  listops/

Note on Pathfinder: for faster loading, preprocess the Pathfinder dataset into a single .npz file:

python scripts/preprocess_pathfinder.py --data-dir /path/to/lra_release/pathfinder32 --resolution 32

This creates pathfinder32_preprocessed.npz which loads faster than individual image files.
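
To sanity-check the archive, you can list its contents. The array names stored by the preprocessing script are not documented here, so inspect rather than assume:

import numpy as np

archive = np.load("data/pathfinder32_preprocessed.npz")
print(archive.files)  # names of the arrays the preprocessing script stored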

Quick Start

# Train baseline (no compression)
python scripts/train.py --config configs/paper/smnist_baseline.yaml --seed 42

# Train with τ=0.01 compression (discard 1% Hankel energy)
python scripts/train.py --config configs/paper/smnist_tau0.01.yaml --seed 42
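
As a toy numeric illustration of the τ rule (assuming, as in the sketch above, that τ is the fraction of squared Hankel-singular-value energy allowed to be discarded):

import numpy as np

hsv = np.array([5.0, 2.0, 0.5, 0.1, 0.05])    # hypothetical HSVs
energy = np.cumsum(hsv**2) / np.sum(hsv**2)   # cumulative energy fraction
r = int(np.searchsorted(energy, 1.0 - 0.01) + 1)
print(r)  # -> 2: states 3-5 together carry under 1% of the Hankel energy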

Paper Reproduction

Config files for all experiments are in configs/paper/:

# sMNIST (Table 2) - 10 seeds
python scripts/reproduce.py configs/paper/smnist_baseline.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0
python scripts/reproduce.py configs/paper/smnist_tau0.01.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0
python scripts/reproduce.py configs/paper/smnist_tau0.02.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0
python scripts/reproduce.py configs/paper/smnist_tau0.04.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0

# sCIFAR (Table 3) - 5 seeds
python scripts/reproduce.py configs/paper/scifar_baseline.yaml --seeds 8 42 123 456 789 --gpu 0
python scripts/reproduce.py configs/paper/scifar_tau0.05.yaml --seeds 8 42 123 456 789 --gpu 0
python scripts/reproduce.py configs/paper/scifar_tau0.10.yaml --seeds 8 42 123 456 789 --gpu 0
python scripts/reproduce.py configs/paper/scifar_tau0.15.yaml --seeds 8 42 123 456 789 --gpu 0

# Aggregate results
python scripts/analyse_results.py outputs/paper/ --output results/

Expected Results

| Config | Table | τ | Accuracy | Final state dim |
|-----------------|---|-----|--------|------|
| smnist_baseline | 2 | 0%  | ~97.3% | 256  |
| smnist_tau0.01  | 2 | 1%  | ~96.9% | ~47  |
| smnist_tau0.02  | 2 | 2%  | ~96.9% | ~28  |
| smnist_tau0.04  | 2 | 4%  | ~95.9% | ~13  |
| scifar_baseline | 3 | 0%  | ~86.5% | 2304 |
| scifar_tau0.05  | 3 | 5%  | ~85.8% | ~161 |
| scifar_tau0.10  | 3 | 10% | ~85.7% | ~93  |
| scifar_tau0.15  | 3 | 15% | ~84.4% | ~57  |

Code Structure

compressm/
├── models/lru.py                  # LRU model with reduction
├── reduction/
│   ├── hsv.py                     # Hankel singular value computation
│   └── balanced_truncation.py     # Balanced truncation algorithm
├── training/trainer.py            # Training loop with in-training compression
└── data/datasets.py               # sMNIST, sCIFAR loaders

configs/paper/                     # Paper reproduction configs
scripts/
├── train.py                       # Training CLI
├── reproduce.py                   # Multi-seed reproduction
└── analyse_results.py             # Results aggregation

Citation

@misc{chahine2026curiouscaseintrainingcompression,
      title={The Curious Case of In-Training Compression of State Space Models}, 
      author={Makram Chahine and Philipp Nazari and Daniela Rus and T. Konstantin Rusch},
      year={2026},
      eprint={2510.02823},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.02823}, 
}

License

MIT License - see LICENSE for details.
