Writing a Transformer (the base for LLMs) in Rust
- Ashish Vaswani et al. (2017). Attention Is All You Need.
- Alec Radford et al. (2019). Language Models are Unsupervised Multitask Learners.
- Harvard NLP Group. The Annotated Transformer.
- 3Blue1Brown, Deep Learning series: "Introduction to Transformers" and "Attention in Transformers".
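The core operation behind every Transformer in the references above is scaled dot-product attention, softmax(QKᵀ/√d_k)·V. As a rough, dependency-free sketch of what this project implements (using plain `Vec<Vec<f64>>` matrices rather than the repo's actual types, which are an assumption here):

```rust
// Sketch of scaled dot-product attention from "Attention Is All You Need":
// softmax(Q K^T / sqrt(d_k)) V, on plain nested-Vec matrices.

fn matmul(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let (n, k, m) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0.0; m]; n];
    for i in 0..n {
        for j in 0..m {
            for t in 0..k {
                out[i][j] += a[i][t] * b[t][j];
            }
        }
    }
    out
}

fn transpose(a: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let (n, m) = (a.len(), a[0].len());
    let mut out = vec![vec![0.0; n]; m];
    for i in 0..n {
        for j in 0..m {
            out[j][i] = a[i][j];
        }
    }
    out
}

// Numerically stable row-wise softmax: subtract each row's max before exp.
fn softmax_rows(a: &mut [Vec<f64>]) {
    for row in a.iter_mut() {
        let max = row.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        let mut sum = 0.0;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in row.iter_mut() {
            *x /= sum;
        }
    }
}

fn attention(q: &[Vec<f64>], k: &[Vec<f64>], v: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let d_k = q[0].len() as f64;
    // Attention scores Q K^T, scaled by 1/sqrt(d_k).
    let mut scores = matmul(q, &transpose(k));
    for row in scores.iter_mut() {
        for x in row.iter_mut() {
            *x /= d_k.sqrt();
        }
    }
    softmax_rows(&mut scores);
    // Attention weights times values.
    matmul(&scores, v)
}

fn main() {
    // Two query tokens attending over two key/value tokens (d_k = 2).
    let q = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let k = q.clone();
    let v = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    for row in attention(&q, &k, &v) {
        println!("{:?}", row);
    }
}
```

A full Transformer block adds multi-head projections, residual connections, layer norm, and a feed-forward network around this primitive; the sketch covers only the attention kernel itself.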
This project is licensed under the MIT License. See the LICENSE file for details.