In this workshop, we take a simple machine learning project from a Jupyter notebook to an automated ML training pipeline. We start by adding experiment tracking to the notebook with MLflow, then refactor the project into a pipeline of data assets using Dagster. The workshop proceeds in five steps:
- Step 1: Experiment tracking
- Step 2: Data orchestration
- Step 3: IO management
- Step 4: Configuration
- Step 5: Automation and scheduling
This repository provides the code skeleton for implementing the project.