- Experiment: process of building an ML model (e.g. training, hyperparameter optimization, etc.)
- Experiment run: each trial of an ML experiment.
- Run artifact: any file associated with a run.
- Metadata: hyperparameters and all other information that serves as input to a run.
Process of keeping track of all the relevant information from an ML experiment.
- Source code / Version (commit hash)
- Environment
- Data
- Hyperparameters
- Metrics
- Reproducibility: To be able to reproduce the same result.
- Organization: Important when working in a team with multiple people.
- Optimization: Optimize the ML model in an organized way.
- Spreadsheet
- Error-prone -> copy and paste
- No standard format
- Poor visibility and collaboration
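To make the list of tracked items concrete (code version, environment, data, hyperparameters, metrics), here is a minimal sketch of a hand-rolled run record using only the standard library. The function names, the record layout, and the sample values are all hypothetical; the point is only to show how much bookkeeping each run needs when done manually.

```python
import hashlib
import json
import platform
import subprocess
import sys
import time


def current_commit():
    """Return the current git commit hash, or None outside a git repository."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def data_fingerprint(raw: bytes) -> str:
    """Hash the raw data so that a change in the dataset is detectable."""
    return hashlib.sha256(raw).hexdigest()


def build_run_record(data: bytes, hyperparameters: dict, metrics: dict) -> dict:
    """Collect everything worth tracking for one run (hypothetical layout)."""
    return {
        "timestamp": time.time(),
        "commit": current_commit(),
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "data_sha256": data_fingerprint(data),
        "hyperparameters": hyperparameters,
        "metrics": metrics,
    }


record = build_run_record(
    data=b"feature,target\n1,2\n",           # stand-in for a real dataset
    hyperparameters={"learning_rate": 0.1},  # illustrative value
    metrics={"rmse": 6.31},                  # placeholder, not a real result
)
print(json.dumps(record, indent=2))
```

Every run would need a record like this stored and shared somewhere, which is the bookkeeping a tracking tool takes over.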
This is where MLflow Tracking comes in.
Open-source platform for the machine learning lifecycle.
In practice, it is a Python package that can be installed using pip and comes with several modules:
- Tracking
- Models
- Model Registry
- Projects
Allows you to organize your experiments into runs and keep track of:
- Parameters: hyperparameters, path to training data, etc.
- Metrics: evaluation metrics
- Metadata: paths, tags to filter runs.
- Artifacts: Visualizations, datasets (doesn't scale...)
- Models: serialized model.
Along with this, MLflow automatically logs:
- Source code
- Version of the code (git commit)
- Start and end time
- Author
Start a gunicorn server with the UI.
mlflow ui
Run metadata (parameters, metrics, tags) will be saved in SQLite, one of the alternatives for the backend store. Artifact files go to a separate artifact store (a local directory by default).
mlflow ui --backend-store-uri sqlite:///mlflow.db
Besides experiment tracking it covers:
- Model versioning
- Model deployment
- Scaling hardware
From the tracking server, register models into a model registry once they are ready for production (staging, production, archived). The model registry is not in charge of deploying any model, so to actually deploy a model you need to add a CI/CD tool.
Possible states of a model within the MLflow Model Registry are:
- Staging
- Production
- Archived
Using a model tracking tool together with a model registry gives you model lineage: you know how each artifact inside the registry was generated.