A simple template for research project repositories. Alternatives include the data science and reproducible science cookiecutters.
To set up a project, run the install script with the path to your project repo:
./install.sh PATH_TO_YOUR_PROJECT_REPO
For instance,
./install.sh ../my_project/
This script creates the following folders and files:
data: raw and derived datasets.
libs: libraries for the project.
models: trained models.
notebooks: (timestamped) experiment notebooks.
paper: manuscripts.
results: results (figures, tables, etc.).
workflow: workflow files and scripts.
.gitignore: temporary and binary files to be ignored by git (LaTeX, Python, Jupyter, data files, etc.).
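As a rough illustration of the layout above, the same structure can be created by hand. This is only a sketch and not the actual contents of install.sh; the variable name target is hypothetical, and a temporary directory stands in for PATH_TO_YOUR_PROJECT_REPO.

```shell
# Hypothetical sketch, NOT the real install.sh: recreate the documented layout.
set -e

# A throwaway temp directory stands in for PATH_TO_YOUR_PROJECT_REPO.
target="$(mktemp -d)"

# One directory per purpose listed above.
for dir in data libs models notebooks paper results workflow; do
    mkdir -p "$target/$dir"
done

# Create an empty .gitignore only if none exists, so existing rules survive.
[ -f "$target/.gitignore" ] || : > "$target/.gitignore"

# Show what was created.
ls -A "$target"
```

Using mkdir -p keeps the sketch safe to re-run on a repo that already has some of these folders.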
Change the PROJ_NAME variable in Makefile to your project name.
Then create a virtual environment either with Python's vanilla virtualenv module or with Anaconda. You can also use tools like poetry.
.envrc enables automatic activation of the virtual environment. See direnv.
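For orientation, an .envrc that activates a virtual environment in .venv might look like the sketch below. The contents are an assumption for illustration and may differ from the file shipped with this template.

```shell
# Hypothetical .envrc contents (an assumption, not the template's actual file).
# direnv evaluates this file with bash when you enter the directory.

# Point at the virtual environment in the project root.
export VIRTUAL_ENV="$PWD/.venv"

# Put the environment's executables first on PATH, which is the essential
# effect of running: source .venv/bin/activate
export PATH="$VIRTUAL_ENV/bin:$PATH"
```

After creating or editing .envrc, run direnv allow once in the project directory so that direnv is permitted to load it.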
With uv, create a virtual environment in the current directory (inside .venv) and activate it.
uv venv
source .venv/bin/activate
Create a requirements.in file that lists high-level package requirements. For instance,
pandas
matplotlib
jupyter
# local libraries
-e ./libs/xxxx
Either install the requirements directly, or compile a lock file first and then install from it. Note that the lock file is platform-specific and may not carry over to other systems.
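# Option 1: install directly from the high-level requirements
uv pip install -r requirements.in
# Option 2: compile a lock file, then install from it
uv pip compile requirements.in -o requirements.txt
uv pip install -r requirements.txt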
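The -e ./libs/xxxx entry in requirements.in installs a local library in editable mode, which only works if that directory contains packaging metadata. The sketch below writes a minimal package skeleton; the name mylib, the hello function, and the choice of setuptools as build backend are all hypothetical placeholders, and a temporary directory stands in for the project root.

```shell
# Hypothetical example: "mylib" is a placeholder, not part of the template.
set -e

# A temp directory stands in for the project root.
lib_dir="$(mktemp -d)/libs/mylib"
mkdir -p "$lib_dir/mylib"

# A package with one function to import from notebooks.
cat > "$lib_dir/mylib/__init__.py" <<'EOF'
def hello():
    return "hello from mylib"
EOF

# Minimal packaging metadata; without it the directory is not installable.
cat > "$lib_dir/pyproject.toml" <<'EOF'
[build-system]
requires = ["setuptools>=64"]
build-backend = "setuptools.build_meta"

[project]
name = "mylib"
version = "0.1.0"
EOF

# With the virtual environment active, install it in editable mode:
#   uv pip install -e "$lib_dir"
echo "package skeleton written to $lib_dir"
```

Once installed this way, edits to the library's source are picked up on the next import without reinstalling.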
With Anaconda, first create a virtual environment for the project.
make create_conda_env
and activate it with
conda activate PROJNAME
or deactivate it with
conda deactivate
Use conda install to install packages. Thanks to the nb_conda package, you don't need to install ipykernel separately for Jupyter.
For the project package, use the pip install -e command to install it as an "editable" package that does not require reinstallation after changes.