Hands-on MLFlow

Introduction

The MNIST dataset is a useful benchmark for trying out learning techniques and pattern recognition methods on real-world data. Here, the classification task is tackled with both classical Machine Learning and Deep Learning approaches. On top of the training loop, an experiment tracker lets the data practitioner compare runs and decide which approach performs better.

Setting up the environment

Using virtualenvwrapper is recommended; see its documentation for instructions on installing and using it.

Environment variables

Set the MLFlow environment variables as follows:

export MLFLOW_TRACKING_URI=sqlite:///experiment/mlflow/db/mydb.sqlite
export ARTIFACT_ROOT=./experiment/mlflow/mlruns/

You can also find them in mlflow.cfg.
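The SQLite URI above points to a file inside nested folders, and SQLite does not create missing parent directories by itself. The following is a minimal sketch, using only the standard library, that reads the two variables and creates the folders they refer to. The helper names (`sqlite_path_from_uri`, `prepare_mlflow_dirs`) are hypothetical and not part of this repository, and the fallback values simply repeat the exports shown above.

```python
import os
from pathlib import Path

# Fallbacks mirror the exports above; adjust them if your layout differs.
DEFAULT_TRACKING_URI = "sqlite:///experiment/mlflow/db/mydb.sqlite"
DEFAULT_ARTIFACT_ROOT = "./experiment/mlflow/mlruns/"


def sqlite_path_from_uri(uri: str) -> Path:
    """Return the file path encoded in a `sqlite:///path` URI."""
    prefix = "sqlite:///"
    if not uri.startswith(prefix):
        raise ValueError(f"not a SQLite URI: {uri!r}")
    return Path(uri[len(prefix):])


def prepare_mlflow_dirs(env=os.environ):
    """Create the database folder and the artifact root if they are missing."""
    tracking_uri = env.get("MLFLOW_TRACKING_URI", DEFAULT_TRACKING_URI)
    db_file = sqlite_path_from_uri(tracking_uri)
    artifact_root = Path(env.get("ARTIFACT_ROOT", DEFAULT_ARTIFACT_ROOT))
    db_file.parent.mkdir(parents=True, exist_ok=True)
    artifact_root.mkdir(parents=True, exist_ok=True)
    return db_file, artifact_root


if __name__ == "__main__":
    db_file, artifact_root = prepare_mlflow_dirs()
    print(f"database file: {db_file}")
    print(f"artifact root: {artifact_root}")
```

Running the script once before starting the server should be enough; it does nothing if the folders already exist.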

MLFlow Server

To enable experiment tracking and the model registry, start the MLFlow server as follows:

mlflow server --default-artifact-root $ARTIFACT_ROOT --backend-store-uri $MLFLOW_TRACKING_URI

About the dataset

The MNIST dataset contains 70,000 small grayscale images (28x28 pixels) of labeled handwritten digits, from 0 to 9. This problem is often called the "Hello World" of Machine Learning because almost everyone who learns Machine Learning tackles it at some point.

Further information about the dataset can be found on the following web pages:

Some examples of the digits are shown below.

Training Loop

Three basic components:

ETL
Model
- Training
- Evaluation
Deployment

Depending on the model evaluation or the data practitioner's criteria, the ETL or the Model may need to change.

Classical Machine Learning

A simplified approach using binary classification, a "5-detector" that separates the digit 5 from all other digits:

Classical Machine Learning
- Scikit-Learn: Stochastic Gradient Descent
- Scikit-Learn: Random Forest
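The 5-detector comparison can be sketched as a small ETL, train, and evaluate pipeline. This is a hedged illustration rather than the repository's code: to stay small and runnable offline it uses scikit-learn's bundled 8x8 digits (`load_digits`) as a stand-in for MNIST, and the function names, the F1 metric, and the 80/20 split are assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split


def etl(test_size=0.2, seed=42):
    """Load the digits, build the binary "is it a 5?" target, and split."""
    X, y = load_digits(return_X_y=True)  # stand-in dataset, not MNIST
    y_is_5 = (y == 5).astype(int)        # 1 for the digit 5, 0 otherwise
    return train_test_split(
        X, y_is_5, test_size=test_size, stratify=y_is_5, random_state=seed
    )


def train_and_evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and return its F1 score on the held-out split."""
    model.fit(X_train, y_train)
    return f1_score(y_test, model.predict(X_test))


def compare_models(seed=42):
    """Train both classifiers on the same split and collect their scores."""
    X_train, X_test, y_train, y_test = etl(seed=seed)
    models = {
        "sgd": SGDClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(random_state=seed),
    }
    return {
        name: train_and_evaluate(model, X_train, X_test, y_train, y_test)
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, score in compare_models().items():
        print(f"{name}: F1 = {score:.3f}")
```

F1 is used here instead of accuracy because only about one digit in ten is a 5, so a classifier that always answers "not 5" would still look accurate.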

Deep Learning

A multiclass classification with the proper architecture:

Convolutional Neural Network
- Using scaled images
- Using n_pca_components to keep 95% of the explained variance
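Keeping 95% of the explained variance can be expressed in scikit-learn by passing a fraction to `PCA(n_components=...)`, which then selects the number of components for you. The sketch below assumes that this is what n_pca_components stands for; it again uses the bundled 8x8 digits as a stand-in for MNIST, and it applies the pixel scaling before the PCA, which may differ from how the repository combines the two steps.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA


def reduce_to_95_percent_variance(X):
    """Scale pixels to [0, 1], then keep enough components for 95% variance."""
    X_scaled = X / X.max()        # for MNIST this amounts to dividing by 255
    pca = PCA(n_components=0.95)  # a float in (0, 1) is a variance target
    X_reduced = pca.fit_transform(X_scaled)
    return X_reduced, pca


if __name__ == "__main__":
    X, _ = load_digits(return_X_y=True)  # stand-in dataset, not MNIST
    X_reduced, pca = reduce_to_95_percent_variance(X)
    print(f"components kept: {pca.n_components_} of {X.shape[1]}")
    print(f"variance explained: {pca.explained_variance_ratio_.sum():.3f}")
```

After fitting, `pca.n_components_` holds the number of components actually kept, which is the value a downstream model would need as its input size.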

Acknowledgments

This hands-on experience with common Computer Vision projects was inspired by TensorFlow in Practice by Laurence Moroney (Coursera) and the concepts explained in Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow by Aurélien Géron (O'Reilly).

Disclaimer

Sections of code were taken from both sources stated in Acknowledgments and all the datasets used in this notebook are open-sourced.

UribeAlejandro/Hands_On_MLFlow
