In this repo only two components of MLflow are used:
- tracking - for tracking parameters, metrics and artifacts
- projects - for isolating the environment in which each pipeline step executes
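As a rough illustration of the tracking component, the sketch below logs parameters, metrics and one small artifact with the MLflow Python API. The helper names flatten_params and log_example_run are hypothetical and not taken from this repo; MLflow is assumed to be installed from requirements.txt.

```python
import tempfile
from pathlib import Path


def flatten_params(nested, prefix=""):
    """Flatten a nested dict into dotted keys, since MLflow params are flat key-value pairs."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


def log_example_run(params, metrics, note="example artifact"):
    """Log params, metrics and a small text file to the currently configured tracking server."""
    import mlflow  # imported lazily so the pure helper above works without MLflow

    with mlflow.start_run() as run:
        mlflow.log_params(flatten_params(params))
        for name, value in metrics.items():
            mlflow.log_metric(name, value)
        # Artifacts are files, so write the note to a temporary file first.
        with tempfile.TemporaryDirectory() as tmp:
            artifact = Path(tmp) / "note.txt"
            artifact.write_text(note)
            mlflow.log_artifact(str(artifact))
        return run.info.run_id
```

Calling log_example_run({"model": {"lr": 0.1}}, {"accuracy": 0.9}) would record the parameter under the key model.lr, assuming a tracking server or local store is reachable.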
MLflow can be used locally without sharing experiments with the team.
To run locally:
- install necessary packages
pip install -r requirements.txt
- in one terminal run
mlflow server --default-artifact-root <PATH_WHERE_TO_STORE_ARTIFACTS>
To set up a simple remote tracking server on EC2, follow the instructions in the tracking_server directory.
To use such a server:
- install (if needed) and configure the AWS CLI
- change tracking_uri in config.yaml to point to the tracking server
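One way the tracking_uri setting could be applied from Python is sketched below. It assumes tracking_uri is a top-level key in config.yaml and that PyYAML is available; neither is confirmed by this README, and the function names are made up for illustration.

```python
from urllib.parse import urlparse


def tracking_uri_from_config(config):
    """Return the tracking_uri entry of an already-parsed config mapping."""
    uri = config.get("tracking_uri")
    if not uri:
        raise KeyError("config has no 'tracking_uri' entry")
    # Assumption: the server is reached over HTTP(S), as with `mlflow server`.
    if urlparse(uri).scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) tracking server URI, got {uri!r}")
    return uri


def configure_tracking(config_path="config.yaml"):
    """Read config.yaml and point the MLflow client at the configured server."""
    import yaml  # PyYAML, assumed to be installed
    import mlflow

    with open(config_path) as fh:
        config = yaml.safe_load(fh)
    uri = tracking_uri_from_config(config)
    mlflow.set_tracking_uri(uri)
    return uri
```

The validation step is a design choice of this sketch: it fails early on a missing or mistyped entry rather than letting later logging calls fail against the wrong location.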
Once the tracking server is ready:
- run the full pipeline:
cd <REPO_ROOT>
mlflow run .
- open tracking_uri in a browser to see the MLflow UI
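The same pipeline run can also be started from Python instead of the command line. This is a sketch only: find_mlproject and run_pipeline are hypothetical helpers, and it assumes the project file at the repo root uses the conventional name MLproject.

```python
from pathlib import Path


def find_mlproject(repo_root="."):
    """Return the path to the MLproject file in repo_root, or None if it is absent."""
    candidate = Path(repo_root) / "MLproject"
    return candidate if candidate.is_file() else None


def run_pipeline(repo_root="."):
    """Run the project's default entry point and return the MLflow run id."""
    import mlflow.projects  # imported lazily so find_mlproject works without MLflow

    if find_mlproject(repo_root) is None:
        raise FileNotFoundError(f"no MLproject file found in {repo_root!r}")
    # Roughly equivalent to `mlflow run .`; blocks until the run finishes.
    submitted = mlflow.projects.run(uri=str(repo_root))
    return submitted.run_id
```

The returned run id can be used to locate the run in the MLflow UI.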
All necessary pipeline configuration options can be found in config.yaml.
Model-specific options are stored in src/train_model/model_config.json.
download_artifact.py is a helper script showing how artifacts can be downloaded.
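For orientation, a download along these lines is possible with the MLflow Python API. This is not the content of download_artifact.py, only a minimal sketch; it assumes an MLflow release that provides mlflow.artifacts.download_artifacts, and prepare_destination and download_run_artifacts are hypothetical names.

```python
from pathlib import Path


def prepare_destination(dst_dir):
    """Create the local download directory if needed and return it as a Path."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    return dst


def download_run_artifacts(run_id, artifact_path=None, dst_dir="downloaded_artifacts"):
    """Download a run's artifacts (all of them if artifact_path is None) to dst_dir."""
    from mlflow.artifacts import download_artifacts  # assumed available in the installed MLflow

    dst = prepare_destination(dst_dir)
    # Returns the local path of the downloaded file or directory.
    return download_artifacts(run_id=run_id, artifact_path=artifact_path, dst_path=str(dst))
```

The run id comes from the MLflow UI or from a previous run; the client must already point at the tracking server that holds the run.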