# transformers python

A GPT-2 implementation in as vanilla Python as possible, for future conversion to C++.

## steps

- implement the decoder stack (masked self-attention, feed forward) using numpy (a sketch follows this list)
- implement the remaining transformer pieces (positional encoding; see the positional-embedding sketch below)
- standardize the weight-loading schema and test to ensure it actually works
- tally the numpy methods used and implement a naive matrix class that supports those operations (see the matrix-class sketch below)
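For the first step, here is a minimal single-head sketch of masked self-attention in numpy. GPT-2 itself uses multiple heads per layer and layer norm around each sublayer; the weight names `w_qkv` and `w_out` are placeholders for illustration, not the checkpoint's tensor names:

```python
import numpy as np

def softmax(x):
    # subtract the row max for numerical stability
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def masked_self_attention(x, w_qkv, w_out):
    # x: (seq_len, d_model); w_qkv: (d_model, 3 * d_model); w_out: (d_model, d_model)
    seq_len, d_model = x.shape
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)
    # scaled dot-product scores: (seq_len, seq_len)
    # (single head, so the scale is d_model; per-head it would be d_head)
    scores = q @ k.T / np.sqrt(d_model)
    # causal mask: position i may only attend to positions <= i
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(mask, -1e10, scores)
    return softmax(scores) @ v @ w_out

# smoke test with random weights
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))
out = masked_self_attention(x, rng.standard_normal((8, 24)), rng.standard_normal((8, 8)))
print(out.shape)  # (5, 8)
```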
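For the positional-encoding step: GPT-2 uses learned position embeddings (the `wpe` matrix in the released checkpoints) rather than the sinusoidal encoding from the original transformer paper, so in practice this step reduces to an embedding lookup and an add. A minimal sketch with toy `d_model`:

```python
import numpy as np

def embed(token_ids, wte, wpe):
    # wte: (vocab_size, d_model) token embeddings
    # wpe: (max_positions, d_model) learned position embeddings
    # GPT-2 adds the position row for each index to the token embedding
    return wte[token_ids] + wpe[np.arange(len(token_ids))]

rng = np.random.default_rng(0)
wte = rng.standard_normal((50257, 8))  # GPT-2 vocab size, toy d_model
wpe = rng.standard_normal((1024, 8))   # GPT-2 context length
print(embed(np.array([15496, 995]), wte, wpe).shape)  # (2, 8)
```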
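For the last step, the exact method list depends on the tally from the previous step, but a naive matrix class would plausibly need at least matmul, elementwise add, and transpose. A sketch of such a class, kept deliberately loop-based so it translates line-for-line to C++:

```python
class Matrix:
    """Naive row-major matrix supporting a minimal set of numpy-like ops."""

    def __init__(self, rows, cols, data=None):
        self.rows, self.cols = rows, cols
        self.data = data if data is not None else [0.0] * (rows * cols)

    def at(self, i, j):
        # row-major indexing into the flat backing list
        return self.data[i * self.cols + j]

    def matmul(self, other):
        assert self.cols == other.rows
        out = Matrix(self.rows, other.cols)
        # i-k-j loop order keeps the inner loop contiguous in memory
        for i in range(self.rows):
            for k in range(self.cols):
                a = self.at(i, k)
                for j in range(other.cols):
                    out.data[i * other.cols + j] += a * other.at(k, j)
        return out

    def add(self, other):
        assert (self.rows, self.cols) == (other.rows, other.cols)
        return Matrix(self.rows, self.cols,
                      [a + b for a, b in zip(self.data, other.data)])

    def transpose(self):
        out = Matrix(self.cols, self.rows)
        for i in range(self.rows):
            for j in range(self.cols):
                out.data[j * out.cols + i] = self.at(i, j)
        return out
```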

## carried by