arith25-stochastic-rounding

Support materials for "On Stochastic Rounding with Few Random Bits", Fitzgibbon and Felix, ARITH 2025

This repository is a fork of nanoGPT, modified to enable quantization-aware training using various float formats and definitions of stochastic rounding from the gfloat library.

The original nanoGPT README is in README-nanoGPT.md.

Installation

First, set up the environment by following README-nanoGPT.md.

Then install the gfloat and awfutils packages:

pip install git+https://github.com/awf/awfutils@7e99007
pip install git+https://github.com/graphcore-research/gfloat@c332c01

At this point, a basic command such as

python train.py config/train_shakespeare_char.py

should run.

Then, to train with quantization to binary8p4, using stochastic rounding:

python train.py config/train_shakespeare_char.py --qat=b8p4 --qat_rnd=sr
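
For intuition about stochastic rounding with only a few random bits, here is a minimal, self-contained sketch. It is not the gfloat implementation and does not correspond to any particular variant from the paper: it rounds to integers rather than to binary8p4, and the function name `stochastic_round_to_int` and its parameters are hypothetical, chosen for illustration only.

```python
import math
import random


def stochastic_round_to_int(x, n_bits, rng):
    """Round x to floor(x) or floor(x) + 1, using n_bits random bits.

    Illustrative assumption: the fractional part of x is compared against
    a threshold drawn from the 2**n_bits midpoints (r + 0.5) / 2**n_bits.
    """
    lo = math.floor(x)
    frac = x - lo                      # fractional part, in [0, 1)
    r = rng.getrandbits(n_bits)        # integer in [0, 2**n_bits)
    threshold = (r + 0.5) / 2**n_bits  # strictly between 0 and 1
    return lo + (1 if frac > threshold else 0)


rng = random.Random(0)
samples = [stochastic_round_to_int(0.25, 3, rng) for _ in range(20_000)]
# The sample mean should be close to 0.25, although every sample is 0 or 1.
print(sum(samples) / len(samples))
```

In this sketch the probability of rounding up can only be a multiple of 2^-n_bits, so the expected value matches the input exactly only when the fractional part falls on that grid. Values that are already integers are returned unchanged.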

Running the paper's experiments

To regenerate Figure 3, run:

python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=float16 --qat_rnd=tne
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_rnd=tne
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_srn=3 --qat_rnd=sr  # Called "SRC" in the paper
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_srn=3 --qat_rnd=srf
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_srn=3 --qat_rnd=srff

To regenerate Figure 4, run:

python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=float16 --qat_rnd=tne
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_rnd=tne
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_srn=3 --qat_rnd=sr  # Called "SRC" in the paper
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_srn=3 --qat_rnd=srf
python train.py config/train_shakespeare_char.py --dtype=bfloat16 --qat=b8p4 --qat_srn=3 --qat_rnd=srff
