
This repository is dedicated to implementing and exploring the inner workings of LLMs by building them from the ground up using basic programming and mathematical concepts. The goal is to delve deeper into the mechanics of LLMs.


trilokpadhi/LLM-from-Scratch


This repository implements LLMs from scratch using PyTorch. It is inspired by Hugging Face Transformers, The Annotated Transformer, and The Illustrated Transformer. The goal is to understand the architecture and implementation details of LLMs. The repository is a work in progress and will be updated regularly.

Table of Contents

  • Introduction
  • Architecture
  • Implementation
  • Usage
  • References

Introduction

LLMs are neural networks trained to predict the next word in a sequence given the previous words. They are used in NLP tasks such as text generation, machine translation, and sentiment analysis. LLMs are based on the Transformer architecture, which processes sequences with a self-attention mechanism. The original Transformer consists of an encoder, which processes the input sequence, and a decoder, which generates the output sequence. LLMs are trained on large corpora of text, from which they learn the statistical patterns needed to predict the next token.
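As a concrete illustration of the next-word (next-token) objective, the sketch below computes the shifted cross-entropy loss: position t predicts token t+1. The random logits and token ids are placeholders for a real model's output and a real batch, not code from this repository.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes (assumptions, not repository code):
# logits: (batch, seq_len, vocab_size) produced by some language model
# tokens: (batch, seq_len) integer token ids
batch, seq_len, vocab_size = 2, 8, 100
logits = torch.randn(batch, seq_len, vocab_size)
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Next-token prediction: drop the last prediction and the first target,
# so each position is scored against the token that follows it.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..T-2
    tokens[:, 1:].reshape(-1),               # targets shifted by one position
)
print(loss.item())
```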

Architecture

The original Transformer is composed of an encoder and a decoder. The encoder processes the input sequence and produces a contextualized representation of it; the decoder uses that representation to generate the output sequence one token at a time. Both are stacks of layers built around attention: self-attention lets each position attend to the other positions in the same sequence, while the decoder additionally uses cross-attention to attend to the encoder's representation. The attention weights determine how much each position contributes to the representation of every other position. Many modern LLMs are decoder-only (GPT-style) models that keep just the decoder stack with causal self-attention.
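The following is a minimal sketch of scaled dot-product self-attention with a single head and no learned query/key/value projections (omitted here purely for clarity): each output position is a weighted average of all positions, with weights given by scaled dot products.

```python
import math
import torch

def self_attention(x, mask=None):
    """Single-head scaled dot-product self-attention over x: (batch, seq_len, d_model).
    Learned Q/K/V projections are omitted for clarity."""
    d_model = x.size(-1)
    # Similarity of every position with every other position.
    scores = x @ x.transpose(-2, -1) / math.sqrt(d_model)   # (batch, seq, seq)
    if mask is not None:
        # Positions where mask == 0 are not allowed to be attended to.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    # Each output position is a weighted average of all input positions.
    return weights @ x

x = torch.randn(1, 5, 16)
# Causal mask so a decoder position cannot attend to future positions.
causal = torch.tril(torch.ones(5, 5))
out = self_attention(x, mask=causal)
print(out.shape)  # torch.Size([1, 5, 16])
```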

Implementation

The implementation uses PyTorch, a popular deep learning library for Python. It follows the Transformer architecture and is organized in a modular way, which allows the architecture and the training process to be customized easily. The implementation is inspired by The Annotated Transformer and The Illustrated Transformer, which explain the Transformer architecture and its implementation details.
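To illustrate what one such modular building block looks like, here is a sketch of a pre-norm, decoder-style Transformer block with masked self-attention and a feed-forward network. The class name and hyperparameters are illustrative assumptions and do not necessarily match transformer.py.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Minimal pre-norm decoder-style block: masked self-attention + feed-forward,
    each wrapped in a residual connection. Hyperparameters are illustrative."""
    def __init__(self, d_model=256, n_heads=4, d_ff=1024, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        seq_len = x.size(1)
        # Causal mask: True marks positions that may NOT be attended to.
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out                      # residual around attention
        x = x + self.ff(self.norm2(x))        # residual around feed-forward
        return x

block = TransformerBlock()
print(block(torch.randn(2, 10, 256)).shape)  # torch.Size([2, 10, 256])
```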

Usage

The repository contains a PyTorch implementation of the LLMs, organized into the following files:

  • transformer.py: Contains the implementation of the Transformer architecture.
  • train.py: Contains the training script for the LLMs.
  • generate.py: Contains the generation script for the LLMs.

To train the LLMs, run the following command:

python train.py

To generate text using the trained LLMs, run the following command:

python generate.py
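For intuition, autoregressive text generation typically follows a loop like the one sketched below: feed the current sequence to the model, sample the next token from the logits of the last position, append it, and repeat. The generate helper and the toy model here are assumptions for illustration and do not reflect the actual contents of generate.py.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=20, temperature=1.0):
    """Sampling loop: take the logits of the last position, sample one token,
    append it to the sequence, and repeat."""
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        logits = model(ids)                        # (batch, seq, vocab)
        next_logits = logits[:, -1] / temperature  # logits for the next token
        probs = torch.softmax(next_logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return ids

# Toy model stand-in (assumption): embedding + linear head over a 100-token vocabulary.
model = nn.Sequential(nn.Embedding(100, 32), nn.Linear(32, 100))
print(generate(model, torch.randint(0, 100, (1, 5))))
```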

References

Blogs

Papers

Repositories

Courses
