
Minimal Bigram Character Model that learns next-character probabilities from a text corpus using a single embedding lookup, trained with cross-entropy and sampled autoregressively. Implemented in PyTorch on input.txt, it demonstrates bigram-only (no long-range) dependencies and prints a generated sample after training.

Bugitoy/Character-Level-Bigram-Model


bigram.py – Minimal Bigram Character Model

What it does

  • Trains a tiny character-level bigram language model on input.txt.
  • Learns next-character probabilities that depend only on the current character (no long-range context).
  • After training, samples text autoregressively.

Key components

  • Data prep: builds a character vocabulary from input.txt, defines encode/decode helpers, and splits the data 90%/10% into train/val.
  • Model: BigramLanguageModel with a single nn.Embedding(vocab_size, vocab_size) whose rows are the next-token logits directly (see the sketch after this list).
  • Training: random contiguous blocks of block_size characters drawn from the corpus, cross-entropy loss, AdamW optimizer.
  • Generation: starts from a single zero token and repeatedly samples the next character from the softmax distribution.

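A minimal sketch of how these pieces typically fit together. The names (BigramLanguageModel, encode, decode, stoi, itos) mirror the bullets above but are assumptions; the actual bigram.py may differ in details.

# Sketch of the data prep and model described above; assumes input.txt is present.
import torch
import torch.nn as nn
from torch.nn import functional as F

text = open('input.txt', 'r', encoding='utf-8').read()
chars = sorted(set(text))
vocab_size = len(chars)
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> integer id
itos = {i: ch for i, ch in enumerate(chars)}   # integer id -> char
encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: ''.join(itos[i] for i in ids)

data = torch.tensor(encode(text), dtype=torch.long)
n = int(0.9 * len(data))                       # 90%/10% train/val split
train_data, val_data = data[:n], data[n:]

class BigramLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # Row i of the table holds the logits for the character that follows character i.
        self.token_embedding_table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        logits = self.token_embedding_table(idx)          # (B, T, vocab_size)
        loss = None
        if targets is not None:
            B, T, C = logits.shape
            loss = F.cross_entropy(logits.view(B * T, C), targets.view(B * T))
        return logits, loss

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            logits, _ = self(idx)
            probs = F.softmax(logits[:, -1, :], dim=-1)   # distribution over the next char
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx
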
Important hyperparameters (defaults in file)

  • batch_size=32, block_size=8, learning_rate=1e-2, max_iters=3000 (a training-loop sketch using these defaults follows).
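
Continuing the sketch above, a training loop with these defaults might look like the following; get_batch, the evaluation cadence, and the number of generated tokens are assumptions, not copied from bigram.py.

# Sketch of a training loop with the listed defaults (builds on the sketch above).
batch_size, block_size = 32, 8
learning_rate, max_iters = 1e-2, 3000
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def get_batch(split):
    data = train_data if split == 'train' else val_data
    ix = torch.randint(len(data) - block_size, (batch_size,))      # random block starts
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])  # targets shifted by one
    return x.to(device), y.to(device)

model = BigramLanguageModel(vocab_size).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)

for step in range(max_iters):
    xb, yb = get_batch('train')
    _, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()

# Generate from a single zero token (index 0 in the vocabulary).
context = torch.zeros((1, 1), dtype=torch.long, device=device)
print(decode(model.generate(context, max_new_tokens=300)[0].tolist()))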

How to run

python bigram.py

Expected output: periodic train/val losses and a printed sample at the end.

Notes

  • This model captures only bigram statistics: each prediction depends on a single preceding character, so it cannot model long-range structure.
  • Uses GPU automatically if available (device = 'cuda' if torch.cuda.is_available() else 'cpu').
