This experiment mainly explores domain-specific pretraining in the context of code generation language models.
Model: StableLM Base Alpha 3B

Datasets:
- The Pile (used to pretrain StableLM Base Alpha 3B)
- CodeParrot Clean (used for both models)
- arXiv CS papers and other code-related datasets, without actual unlabelled code (used for custom pretraining; sketched below)
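As a rough illustration, here is a minimal sketch of how such a pretraining mixture could be assembled with the Hugging Face `datasets` library. The papers dataset ID (`my-org/arxiv-cs-papers`), its column name, and the 70/30 mixing ratio are placeholder assumptions, not the exact configuration used here.

```python
# Minimal sketch of assembling a domain-specific pretraining mixture with
# Hugging Face `datasets`. Dataset IDs, column names, and weights below are
# illustrative placeholders, not this project's exact configuration.
from datasets import load_dataset, interleave_datasets

# Code corpus: CodeParrot Clean, streamed to avoid downloading the full dump.
code = load_dataset("codeparrot/codeparrot-clean", split="train", streaming=True)

# Papers corpus: placeholder ID, substitute the actual arXiv CS dataset.
papers = load_dataset("my-org/arxiv-cs-papers", split="train", streaming=True)

# Normalize both sources to a single "text" column so they can be interleaved.
code = code.map(lambda ex: {"text": ex["content"]})       # codeparrot-clean stores code in "content"
papers = papers.map(lambda ex: {"text": ex["abstract"]})  # assumed column name

# Mix roughly 70% code with 30% papers; the seed makes the sampling reproducible.
mixture = interleave_datasets(
    [code.select_columns(["text"]), papers.select_columns(["text"])],
    probabilities=[0.7, 0.3],
    seed=42,
)

# Peek at a few examples from the interleaved stream.
for example in mixture.take(3):
    print(example["text"][:80])
```

Streaming keeps both corpora as on-the-fly iterators rather than full downloads, which matters at pretraining scale, and the `probabilities` argument gives a simple knob for the domain mix.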
Code and benchmarks are coming soon
Thanks to the TPU Research Cloud (TRC) program for providing compute resources
Thanks to the various authors whose code I have used to build this, most notably Lit-GPT by Lightning AI.
Please reach out to me by email if you would like to get in touch.