Personal re-implementations of machine learning papers. These re-implementations may use different hyper-parameters, datasets, or settings than the original papers.
Current re-implementations include:
| Paper | Code | Blog |
|---|---|---|
| **Natural language processing** | | |
| A Watermark for Large Language Models | Code | |
| Attention is all you need | Code | Coming soon |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | Code | Coming soon |
| Language Modeling Is Compression | Code | |
| Language Models are Few-Shot Learners | Code | Coming soon |
| **Computer Vision** | | |
| An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Code | Blog |
| Denoising Diffusion Probabilistic Models | Code | Blog |
| Density estimation using Real NVP | Code | |
| Idempotent Generative Network | Code | Blog |
| ViR: Vision Retention Networks | Code | Blog |
| **Reinforcement Learning** | | |
| Proximal Policy Optimization Algorithms | Code | Blog |
| Playing Atari with Deep Reinforcement Learning | Code | Coming soon |
| **Others** | | |
| Everything is Connected: Graph Neural Networks | Code | |
| Fast Feedforward Networks | Code | |
While this repo is a personal attempt to familiarize myself with these ideas down to the nitty-gritty details, contributions are welcome for re-implementations that are already in the repository. In particular, I am open to discussing doubts and questions, suggestions for improving the code, and spotted mistakes or bugs. If you would like to contribute, please raise an issue before submitting a pull request.
The code is released under the MIT license.