The simplest, fastest repository for training/finetuning medium-sized GPTs. This repository contains code for a Transformer-based, decoder-only language model: the Generative Pre-trained Transformer (GPT). GPT is a language-model architecture that has achieved impressive results across natural language processing tasks, including text generation and language understanding. The implementation is directly inspired by Andrej Karpathy's "Let's build GPT" video.
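For orientation, here is a minimal sketch of the core building block of a decoder-only GPT: causal self-attention, where each token may only attend to earlier tokens. The hyperparameters and class name below are illustrative, not necessarily those used in `gpt.py`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask (illustrative sketch, not this repo's exact code)."""

    def __init__(self, n_embd=64, n_head=4, block_size=32):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)  # project to queries, keys, values in one go
        self.proj = nn.Linear(n_embd, n_embd)
        # lower-triangular mask: position t may attend to positions <= t only
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # split channels into heads: (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / (k.size(-1) ** 0.5)        # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))  # hide future positions
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)      # recombine heads
        return self.proj(y)

x = torch.randn(2, 32, 64)             # (batch, time, channels)
print(CausalSelfAttention()(x).shape)  # torch.Size([2, 32, 64])
```

A full GPT stacks several such blocks with residual connections, layer norms, and an MLP on top of token and position embeddings.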
The code here distills what I learned from Karpathy's makemore series, which ultimately leads up to this model. I have also implemented the code from the makemore series; you can check it out here: Makemore series
Simply install the dependencies using:

```bash
pip install -r requirements.txt
```
To train the GPT model, follow these steps:
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Prepare your training data in a plain-text file at `data/train.txt` (a sketch after these steps shows one way to produce it).
- Run the training script:
  ```bash
  cd nanoGPT
  python gpt.py
  ```
You will also see sample outputs once the training script completes.
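If you do not yet have a corpus for `data/train.txt`, a sketch like the following works with any plain-text source; the tiny-shakespeare URL here is just a common toy dataset, not something this repo requires:

```python
import os
import urllib.request

# Illustrative: any plain-text corpus can serve as training data.
url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"

os.makedirs("data", exist_ok=True)
with urllib.request.urlopen(url) as resp:
    text = resp.read().decode("utf-8")
with open("data/train.txt", "w", encoding="utf-8") as f:
    f.write(text)
print(f"wrote {len(text):,} characters to data/train.txt")
```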
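The sampling at the end of training is typically an autoregressive loop along these lines. This is a sketch that assumes the model maps token indices to logits of shape `(B, T, vocab_size)`, which may differ from the exact interface in `gpt.py`:

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size=32):
    """Sketch: repeatedly predict the next token and append it to the context."""
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]                  # crop to the model's context window
        logits = model(idx_cond)                         # assumed shape: (B, T, vocab_size)
        probs = torch.softmax(logits[:, -1, :], dim=-1)  # distribution over the next token
        next_token = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_token], dim=1)        # append and continue
    return idx
```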