Ce-daros/Tinystory-LM-656K-param

TinyStoriesV2-656K code

What's Inside

  • train_.ipynb: Trains a language model on the TinyStoriesV2 dataset. It covers data prep, model setup, training details, and a short model inference demo.

  • train_tokenizer.ipynb: Trains a brand-new tokenizer from scratch, just for TinyStoriesV2. It covers dataset loading, corpus prep for tokenizing, and the actual tokenizer training.
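
To give a feel for what "training a tokenizer from scratch" involves, here is a toy byte-pair-encoding (BPE) sketch in plain Python. It is an illustration only: the notebook's actual algorithm, library, and vocabulary size are not stated in this README, and the names `train_bpe`, `merge_pair`, and `encode_word` are made up for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from an iterable of strings (hypothetical helper)."""
    # Corpus prep: split on whitespace and store each word as a tuple of characters.
    words = Counter()
    for text in corpus:
        for word in text.split():
            words[tuple(word)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the new merge rule to every word.
        merged = Counter()
        for symbols, freq in words.items():
            merged[tuple(merge_pair(symbols, best))] += freq
        words = merged
    return merges


def encode_word(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


merges = train_bpe(["low low low lower"], num_merges=2)
print(merges)                        # [('o', 'w'), ('l', 'ow')]
print(encode_word("lower", merges))  # ['low', 'e', 'r']
```

A real tokenizer for TinyStoriesV2 would also need special tokens, byte-level or unknown-character handling, and a much larger number of merges, so treat this as a sketch of the idea rather than what the notebook does.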
