A collection of Small Language Models (SLMs) built from scratch in PyTorch.
Currently Implemented
A compact transformer language model trained on the TinyStories dataset, with 6 layers, 6 attention heads, and mixed-precision training. It generates coherent children's stories and reaches a validation loss of ~1.99.
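To put these numbers in context, here is a minimal sketch in plain Python. It is not the repository's code. `SLMConfig`, `perplexity`, and the embedding width `n_embd=384` are hypothetical and chosen only for illustration; only the layer and head counts and the loss value come from the description above. The perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which is what PyTorch's cross-entropy loss returns by default.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SLMConfig:
    """Hypothetical config holder; not the project's actual class."""
    n_layer: int = 6   # from the description above
    n_head: int = 6    # from the description above
    n_embd: int = 384  # ASSUMPTION: embedding width is not stated in this README

    def __post_init__(self) -> None:
        # Multi-head attention splits the embedding evenly across heads.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")

    @property
    def head_dim(self) -> int:
        # Width of each attention head.
        return self.n_embd // self.n_head


def perplexity(mean_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_loss_nats)


config = SLMConfig()
val_perplexity = perplexity(1.99)
print(f"head_dim={config.head_dim}, perplexity={val_perplexity:.2f}")
```

Under these assumptions, a validation loss of ~1.99 corresponds to a perplexity of roughly 7.3, meaning the model is on average about as uncertain as if it were choosing uniformly among seven or so tokens at each step.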