This is a challenge project that demonstrates how to use Prefect and MLflow to build, track, and deploy an Iris classification model. Students will need to complete the implementation of various components while following MLOps best practices.
In this challenge, you will implement a machine learning pipeline using Prefect and MLflow. The pipeline will:
- Load and preprocess the Iris dataset
- Train a Random Forest classifier
- Track experiments with MLflow
- Deploy the model using Prefect
Your task is to complete the implementation in the following files:

- `flows/iris_flow_mlflow.py`: Implement the main ML pipeline (create this file if it doesn't exist)
- `init_blocks/init.py`: Set up Prefect blocks and variables
- `deployments/create_deployment.py`: Create a deployment for the flow
- `docker-compose.yml`: Complete the prefect-init service configuration

The test suite in `tests/` will help validate your implementation.
This challenge requires your repository to be named exactly "prefect-iris-classification". This is crucial for the tests to pass, as they rely on specific paths and configurations. Make sure to:
- Name your repository "prefect-iris-classification" when you create it
- Use this exact name in your GITHUB_REPOSITORY environment variable
- Keep all file paths and names as they are in the original repository
The flow file must be created at exactly this location:

```
flows/
└── iris_flow_mlflow.py   # Create this file with the implementation
```
This specific path is required because:
- The deployment configuration expects this exact path
- The tests look for this specific file location
- The Docker setup mounts this path
- Docker and Docker Compose
- Git
- GitHub Personal Access Token with `repo` scope
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/prefect-iris-classification.git
  cd prefect-iris-classification
  ```
- Create a `.env` file in the root directory with the following content:

  ```
  PREFECT_API_URL=http://127.0.0.1:4200/api
  GH_TOKEN=your_github_token
  GITHUB_REPOSITORY=your-username/prefect-iris-classification  # Must use this exact repository name
  GITHUB_BRANCH=main
  POSTGRES_PASSWORD=prefect
  ```

  Replace:
  - `your_github_token` with your actual GitHub Personal Access Token
  - `your-username` with your GitHub username
1. **Main Flow Implementation** (`flows/iris_flow_mlflow.py`):
   - Implement data loading and preprocessing
   - Set up model training with configurable hyperparameters
   - Add MLflow tracking
   - Configure Prefect tasks and flow

2. **Block Initialization** (`init_blocks/init.py`):
   - Set up GitHub credentials
   - Configure repository access
   - Create default variables

3. **Deployment Creation** (`deployments/create_deployment.py`):
   - Create deployment from GitHub source
   - Configure work pool and tags

4. **Docker Compose Configuration** (`docker-compose.yml`):
   - Complete the prefect-init service configuration
   - Set up proper dependencies and environment variables
   - Configure command sequence for initialization
Start all services using Docker Compose:

```bash
docker compose up -d
```
This will start:
- Prefect Server (UI available at http://127.0.0.1:4200)
- MLflow Server (UI available at http://127.0.0.1:5001; on a VM, you may need to use http://<your-VM-ip>:5001 instead)
- PostgreSQL database
- Prefect worker
- Required initialization services
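The prefect-init service is one of the pieces you must complete in `docker-compose.yml`. A hypothetical sketch of its shape (nested under the top-level `services:` key) is shown below; the build context, dependencies, and command sequence are all assumptions to adapt to the rest of your compose file.

```yaml
# Hypothetical prefect-init service — names and commands are assumptions.
prefect-init:
  build:
    context: .
    dockerfile: prefect.Dockerfile
  depends_on:
    - prefect-server
  environment:
    - PREFECT_API_URL=${PREFECT_API_URL}
    - GH_TOKEN=${GH_TOKEN}
    - GITHUB_REPOSITORY=${GITHUB_REPOSITORY}
    - GITHUB_BRANCH=${GITHUB_BRANCH}
  command: >
    sh -c "prefect work-pool create ml-pool --type process &&
           python init_blocks/init.py &&
           python deployments/create_deployment.py"
```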
Run the test suite to validate your implementation:

```bash
docker compose up test
```
The tests will check:
- Model training functionality
- MLflow tracking integration
- Deployment configuration
- Docker environment setup
After successful completion, you should see output like the following:

```
test_1  | tests/test_deployment.py::test_deployment_configuration PASSED [ 25%]
test_1  | tests/test_deployment.py::test_deployment_run PASSED [ 50%]
test_1  | tests/test_integration.py::test_end_to_end_flow PASSED [ 75%]
test_1  | tests/test_integration.py::test_docker_deployment PASSED [100%]
test_1  |
test_1  | =============================== warnings summary ===============================
```
To run the deployment from the Prefect UI:

- Open the Prefect UI at http://127.0.0.1:4200
- Navigate to "Deployments"
- Find the "iris-classification-flow/iris-model" deployment
- Click "Run" to start a flow run
- Monitor the run progress in the UI
Alternatively, run the deployment using the Prefect CLI:

```bash
docker compose exec prefect-worker prefect deployment run 'iris-classification-flow/iris-model'
```
- **Prefect UI** (http://127.0.0.1:4200):
  - View flow runs and their status
  - Monitor task execution
  - Check logs and error messages

- **MLflow UI** (http://127.0.0.1:5001):
  - View experiment tracking
  - Compare model metrics
  - Access saved model artifacts
Your implementation will be evaluated based on:
- All tests passing successfully
- Proper use of Prefect features (tasks, flows, variables)
- Effective MLflow integration
- Code organization and documentation
- Error handling and logging
- Correct Docker service configuration
1. **No deployments visible in UI:**
   - Ensure all services are running: `docker compose ps`
   - Check prefect-init logs: `docker compose logs prefect-init`
   - Verify environment variables in `.env`
   - Make sure the repository name is exactly "prefect-iris-classification"

2. **Worker not picking up runs:**
   - Check worker logs: `docker compose logs prefect-worker`
   - Ensure the work pool "ml-pool" exists
   - Verify the worker is connected to the correct API

3. **MLflow tracking issues:**
   - Check MLflow server logs: `docker compose logs mlflow`
   - Verify the MLflow UI is accessible
   - Check the tracking URI configuration
```
.
├── deployments/          # Deployment configuration (to be implemented)
├── flows/                # Prefect flow definitions (to be implemented)
├── init_blocks/          # Prefect block initialization (to be implemented)
├── tests/                # Test suite (update lines 28-29 in test_integration.py depending on your Docker version)
├── docker-compose.yml    # Docker services configuration (partially complete)
├── prefect.Dockerfile    # Prefect service image
└── mlflow.Dockerfile     # MLflow service image
```