
MyDarapy/gpt-1-from-scratch


About

Pretraining GPT-1 from scratch: implementing multi-head attention (MHA) in PyTorch, following Improving Language Understanding by Generative Pre-Training (https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf).
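
Below is a minimal sketch of the kind of multi-head attention block described above, using the GPT-1 paper's hyperparameters (768-dim embeddings, 12 heads, 512-token context). The class name, argument names, and masking details are illustrative assumptions, not necessarily this repository's exact implementation.

```python
# Illustrative multi-head causal self-attention in PyTorch.
# d_model=768, n_heads=12, block_size=512 follow the GPT-1 paper;
# names and structure here are assumptions for the sketch.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12,
                 block_size: int = 512, dropout: float = 0.1):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.head_dim = d_model // n_heads
        # Single projection producing queries, keys, and values together
        self.qkv_proj = nn.Linear(d_model, 3 * d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        self.dropout = nn.Dropout(dropout)
        # Causal mask: position i may only attend to positions <= i
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("causal_mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape  # batch, sequence length, embedding dim
        q, k, v = self.qkv_proj(x).split(C, dim=2)
        # Reshape each to (B, n_heads, T, head_dim)
        q = q.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        # Scaled dot-product attention with the causal mask applied
        att = (q @ k.transpose(-2, -1)) / math.sqrt(self.head_dim)
        att = att.masked_fill(self.causal_mask[:, :, :T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        att = self.dropout(att)
        y = att @ v  # (B, n_heads, T, head_dim)
        # Merge heads back together and project out
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.out_proj(y)


# Example usage: a batch of 4 sequences of length 128 with 768-dim embeddings
if __name__ == "__main__":
    mha = MultiHeadAttention()
    x = torch.randn(4, 128, 768)
    print(mha(x).shape)  # torch.Size([4, 128, 768])
```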
