
Code Generation Models

Overview:

This project experiments with domain-specific pretraining in the context of code generation language models.

This project has been abandoned in favour of a JAX implementation

Model used:

StableLM Base Alpha 3B
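
For context, a minimal, illustrative sketch of loading this model with Hugging Face Transformers is shown below (the `stabilityai/stablelm-base-alpha-3b` Hub checkpoint is assumed; the experiments in this repository build on Lit-GPT, so this is not the project's actual training code):

```python
# Illustrative only: load StableLM Base Alpha 3B with Hugging Face Transformers.
# The checkpoint name is an assumption; this repo's experiments use Lit-GPT instead.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-base-alpha-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Quick sanity check: generate a short code completion.
prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```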

Datasets used:

The Pile (StableLM Base Alpha 3B) for the pretrained model

CodeParrot Clean for both models (a loading sketch is shown below)

Unconfirmed (Planning)

arXiv CS papers and other code-related datasets without actual unlabelled code (for custom pretraining)
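
For illustration, the CodeParrot Clean corpus could be streamed with the Hugging Face `datasets` library roughly as follows (the Hub ID `codeparrot/codeparrot-clean` and the `content` text field are assumptions; the project's actual data pipeline may differ):

```python
# Illustrative sketch: stream a pretraining corpus with Hugging Face Datasets.
# The dataset ID and field name are assumptions, not this project's pipeline.
from datasets import load_dataset

# CodeParrot Clean: deduplicated Python source files, streamed to avoid a full download.
code_ds = load_dataset("codeparrot/codeparrot-clean", split="train", streaming=True)

# Peek at one example to confirm the text field used for tokenization.
example = next(iter(code_ds))
print(example["content"][:200])
```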

Project underway

Code and benchmarks coming in the near future

Credits

Thanks to the TPU Research Cloud (TRC) program for providing compute resources

Thanks to the various authors whose code I have used to build this, most notably Lit-GPT by Lightning AI.

Contact

Please reach out to me via email.
