Build high-performance AI models with modular building blocks
-
Updated
Jan 13, 2025 - Python
Build high-performance AI models with modular building blocks
Tools and experiments with the LongNet model
An implementation of (something like) the LongNet dilated attention paper
implementation of the paper LongNet: Scaling Transformers to 1,000,000,000 Tokens by Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, and Furu Wei.
Add a description, image, and links to the longnet topic page so that developers can more easily learn about it.
To associate your repository with the longnet topic, visit your repo's landing page and select "manage topics."