Convert all files in a Git repository to .txt files. This is useful for training LLMs on your codebase.
- Create a new .env file by copying example.env

      cp example.env .env

- Add necessary fields. The default fields are good to start with.

      GIT_PROJECT_DIRECTORY=/path/to/git/repo
      IGNORE_FILES=.env,package-lock.json
      IGNORE_DIRS=.git,.vscode,node_modules
      SAVE_DIRECTORY=training_data
      SKIP_EMPTY_FILES=true

- Install dependencies. Using a virtual environment is recommended.

      python -m pip install -r requirements.txt

- Run program

      python main.py

- You'll see your data files in the training_data/ directory. The location will differ if you changed SAVE_DIRECTORY in the .env file.
- This program requires Python version 3.6 or later. It uses the f-string formatting technique introduced in Python 3.6.
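
To make the settings above more concrete, here is a minimal sketch of how a converter driven by them could work. It is not the code in main.py. The function names (`parse_list`, `settings_from_env`, `convert_repo`), the flattened output naming (`src/app.py` becomes `src_app.py.txt`), and the choice to skip files that are not valid UTF-8 are all assumptions made for illustration. The sketch reads settings from a plain mapping such as `os.environ`; loading the .env file itself is left out.

```python
import os
from pathlib import Path


def parse_list(value):
    """Split a comma-separated setting such as IGNORE_DIRS into a set of names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def settings_from_env(env):
    """Build convert_repo() keyword arguments from a mapping of settings.

    The fallback values mirror the defaults shown in the list above.
    """
    return {
        "project_dir": env.get("GIT_PROJECT_DIRECTORY", "."),
        "save_dir": env.get("SAVE_DIRECTORY", "training_data"),
        "ignore_files": parse_list(env.get("IGNORE_FILES", ".env,package-lock.json")),
        "ignore_dirs": parse_list(env.get("IGNORE_DIRS", ".git,.vscode,node_modules")),
        "skip_empty": env.get("SKIP_EMPTY_FILES", "true").strip().lower() == "true",
    }


def convert_repo(project_dir, save_dir, ignore_files=(), ignore_dirs=(), skip_empty=True):
    """Copy each UTF-8 text file under project_dir into save_dir as a .txt file.

    Returns the list of paths that were written.
    """
    project_dir = Path(project_dir)
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    save_root = save_dir.resolve()
    written = []
    for root, dirs, files in os.walk(str(project_dir)):
        # Prune ignored directories (and the output directory) in place
        # so os.walk never descends into them.
        dirs[:] = sorted(
            d for d in dirs
            if d not in ignore_dirs and (Path(root) / d).resolve() != save_root
        )
        for name in sorted(files):
            if name in ignore_files:
                continue
            source = Path(root) / name
            try:
                text = source.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # assumption: binary or unreadable files are skipped
            if skip_empty and not text.strip():
                continue
            # Assumed naming scheme: join the relative path parts with "_".
            relative = source.relative_to(project_dir)
            target = save_dir / ("_".join(relative.parts) + ".txt")
            target.write_text(text, encoding="utf-8")
            written.append(target)
    return written
```

With the variables exported in the shell, calling `convert_repo(**settings_from_env(os.environ))` would run the sketch against the configured repository. Pruning `dirs` in place is what keeps the walk out of .git and node_modules, which matters for large repositories.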