CompreSSM

The Curious Case of In-Training Compression of State Space Models
Paper • ICLR 2026 • Mamba Experiments

Python 3.10+ · JAX · MIT License


This is the main repository accompanying the ICLR 2026 paper The Curious Case of In-Training Compression of State Space Models. It contains the LRU experiments. The Mamba experiments are available in the CompreSSMamba repository.


State Space Models (SSMs) offer parallelizable training and fast inference for long-sequence modeling. At their core are recurrent dynamical systems whose per-step update cost scales with the state dimension. CompreSSM applies balanced truncation, a classical control-theoretic model-reduction technique, during training to identify and remove low-influence states based on their Hankel singular values. Models that start large and shrink during training retain the computational efficiency of small models while outperforming models trained directly at the smaller state dimension.
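
To make the mechanism concrete, here is a minimal one-shot sketch of balanced truncation via the square-root method. Everything below is illustrative: the function name, the dense SciPy Lyapunov solvers, and the reading of τ as the fraction of squared-HSV energy allowed to be discarded (consistent with the Quick Start comment below) are assumptions for exposition, not the CompreSSM implementation in compressm/reduction/.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, tau):
    """Reduce a stable discrete-time SSM x' = A x + B u, y = C x.

    Keeps the smallest number of states whose Hankel singular values
    retain at least a (1 - tau) fraction of the squared-HSV energy.
    Assumes rho(A) < 1 and positive-definite Gramians.
    """
    # Gramians: P = A P A^T + B B^T (controllability),
    #           Q = A^T Q A + C^T C (observability).
    P = solve_discrete_lyapunov(A, B @ B.T)
    Q = solve_discrete_lyapunov(A.T, C.T @ C)
    # Square-root method: the HSVs are the singular values of Lq^T Lp.
    Lp = cholesky(P, lower=True)
    Lq = cholesky(Q, lower=True)
    U, hsv, Vt = svd(Lq.T @ Lp)
    # Truncation order: discard at most a tau fraction of sum(hsv^2).
    energy = np.cumsum(hsv**2) / np.sum(hsv**2)
    r = int(np.searchsorted(energy, 1.0 - tau) + 1)
    # Balancing projection restricted to the r dominant directions.
    S = np.diag(hsv[:r] ** -0.5)
    T = Lp @ Vt[:r].T @ S         # maps reduced state -> full state
    Ti = S @ U[:, :r].T @ Lq.T    # maps full state -> reduced state
    return Ti @ A @ T, Ti @ B, C @ T, hsv

Applied periodically during training, a reduction of this kind lets the state dimension shrink as the trailing Hankel singular values decay.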


Installation

git clone https://github.com/camail-official/compressm.git
cd compressm
conda env create -f environment.yaml
conda activate compressm

Data Preparation

The MNIST and CIFAR datasets auto-download into the data/ directory. The LRA datasets must be downloaded manually from the LRA GitHub page and organized as follows:

path/to/lra_release/
  pathfinder/
    pathfinder32/
    pathfinder64/
    pathfinder128/
    pathfinder256/
  aan/
  listops/

Note on Pathfinder: for faster loading, preprocess the Pathfinder dataset into a single .npz file:

python scripts/preprocess_pathfinder.py --data-dir /path/to/lra_release/pathfinder32 --resolution 32

This creates pathfinder32_preprocessed.npz which loads faster than individual image files.
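
To sanity-check the archive, you can list its contents. The array names stored by the preprocessing script are not documented here, so inspect rather than assume:

import numpy as np

archive = np.load("data/pathfinder32_preprocessed.npz")
print(archive.files)  # names of the arrays the preprocessing script stored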

Quick Start

# Train baseline (no compression)
python scripts/train.py --config configs/paper/smnist_baseline.yaml --seed 42

# Train with τ=0.01 compression (discard 1% Hankel energy)
python scripts/train.py --config configs/paper/smnist_tau0.01.yaml --seed 42
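
As a toy numeric illustration of the τ rule (assuming, as in the sketch above, that τ is the fraction of squared Hankel-singular-value energy allowed to be discarded):

import numpy as np

hsv = np.array([5.0, 2.0, 0.5, 0.1, 0.05])    # hypothetical HSVs
energy = np.cumsum(hsv**2) / np.sum(hsv**2)   # cumulative energy fraction
r = int(np.searchsorted(energy, 1.0 - 0.01) + 1)
print(r)  # -> 2: states 3-5 together carry under 1% of the Hankel energy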

Paper Reproduction

Config files for all experiments are in configs/paper/:

# sMNIST (Table 2) - 10 seeds
python scripts/reproduce.py configs/paper/smnist_baseline.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0
python scripts/reproduce.py configs/paper/smnist_tau0.01.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0
python scripts/reproduce.py configs/paper/smnist_tau0.02.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0
python scripts/reproduce.py configs/paper/smnist_tau0.04.yaml --seeds 8 42 123 456 789 101 202 303 404 505 --gpu 0

# sCIFAR (Table 3) - 5 seeds
python scripts/reproduce.py configs/paper/scifar_baseline.yaml --seeds 8 42 123 456 789 --gpu 0
python scripts/reproduce.py configs/paper/scifar_tau0.05.yaml --seeds 8 42 123 456 789 --gpu 0
python scripts/reproduce.py configs/paper/scifar_tau0.10.yaml --seeds 8 42 123 456 789 --gpu 0
python scripts/reproduce.py configs/paper/scifar_tau0.15.yaml --seeds 8 42 123 456 789 --gpu 0

# Aggregate results
python scripts/analyse_results.py outputs/paper/ --output results/

Expected Results

| Config | Table | τ | Accuracy | Final state dim |
|-----------------|---|-----|--------|------|
| smnist_baseline | 2 | 0%  | ~97.3% | 256  |
| smnist_tau0.01  | 2 | 1%  | ~96.9% | ~47  |
| smnist_tau0.02  | 2 | 2%  | ~96.9% | ~28  |
| smnist_tau0.04  | 2 | 4%  | ~95.9% | ~13  |
| scifar_baseline | 3 | 0%  | ~86.5% | 2304 |
| scifar_tau0.05  | 3 | 5%  | ~85.8% | ~161 |
| scifar_tau0.10  | 3 | 10% | ~85.7% | ~93  |
| scifar_tau0.15  | 3 | 15% | ~84.4% | ~57  |

Code Structure

compressm/
├── models/lru.py                  # LRU model with reduction
├── reduction/
│   ├── hsv.py                     # Hankel singular value computation
│   └── balanced_truncation.py     # Balanced truncation algorithm
├── training/trainer.py            # Training loop with in-training compression
└── data/datasets.py               # sMNIST, sCIFAR loaders

configs/paper/                     # Paper reproduction configs
scripts/
├── train.py                       # Training CLI
├── reproduce.py                   # Multi-seed reproduction
└── analyse_results.py             # Results aggregation

Citation

@misc{chahine2026curiouscaseintrainingcompression,
      title={The Curious Case of In-Training Compression of State Space Models}, 
      author={Makram Chahine and Philipp Nazari and Daniela Rus and T. Konstantin Rusch},
      year={2026},
      eprint={2510.02823},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.02823}, 
}

License

MIT License - see LICENSE for details.
