forked from HazyResearch/H3

Language Modeling with the H3 State Space Model

Hungry Hungry Hippos (H3)

This repository provides the official implementation of H3 from the following paper.

Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Tri Dao*, Daniel Y. Fu*, Khaled K. Saab, Armin W. Thomas, Atri Rudra, Christopher Ré
International Conference on Learning Representations, 2023. Notable top-25% (spotlight). Paper: https://arxiv.org/abs/2212.14052

Code & model release

You can find model weights on the HuggingFace Hub here (under "Files and Versions" for each model):

An example of how to load the weights is given in benchmarks/benchmark_generation.py. More examples coming soon!

Acknowledgments

Some of the files related to S4D and HiPPO initialization are adapted from the state-spaces repository: https://github.com/HazyResearch/state-spaces.

Citation

If you use this codebase, or otherwise find our work valuable, please cite:

@inproceedings{dao2023hungry,
  title={Hungry {H}ungry {H}ippos: Towards Language Modeling with State Space Models},
  author={Dao, Tri and Fu, Daniel Y. and Saab, Khaled K. and Thomas, Armin W.
  and Rudra, Atri and R{\'e}, Christopher},
  booktitle={International Conference on Learning Representations},
  year={2023}
}
