des137/Machine-Learning-Classifier-Example

MarketingClassification

Contents

1. Background

2. Motivation

3. Results

This repository contains models built on the Bank Marketing data set available from the UCI ML repository. The classification goal is to predict whether a customer will accept the 'CD' (Certificate of Deposit) offer, based on customer-related data and data from previous campaigns.

This notebook first performs a basic data exploration to check the integrity of the data.

One of the motivations for studying this problem is to show the end-to-end pipeline feature of the sklearn library. 'sklearn' is a remarkably well-designed library that lets one quickly prototype a data flow pipeline and test a variety of machine learning models by chaining a set of Estimators, Transformers, and Predictors. This notebook demonstrates the pipeline feature in practice. Ten different models were tested on this dataset.
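As a minimal sketch of the chaining idea, the snippet below wraps a scaler and a classifier in a `Pipeline` and swaps the final estimator to compare candidates. The synthetic data from `make_classification`, the two candidate models, and the step names `scale` and `model` are assumptions for illustration; they are not the notebook's actual data, model list, or preprocessing.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the bank marketing data (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.85, 0.15], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Hypothetical candidate models; the notebook tested ten, not listed here.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

auc_by_model = {}
for name, model in candidates.items():
    # Transformer followed by a final estimator, fitted as one object.
    pipe = Pipeline(steps=[("scale", StandardScaler()), ("model", model)])
    pipe.fit(X_train, y_train)
    scores = pipe.predict_proba(X_test)[:, 1]  # probability of the positive class
    auc_by_model[name] = roc_auc_score(y_test, scores)

for name, auc in auc_by_model.items():
    print(f"{name}: AUC = {auc:.3f}")
```

Because the scaler lives inside the pipeline, it is fitted on the training split only, which avoids leaking test-set statistics into preprocessing.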

The business context usually determines how many False Positives versus False Negatives are acceptable. The accompanying plot shows precision, recall, and F1 score as functions of the classification threshold.
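To make the threshold trade-off concrete, here is a small dependency-free sketch that computes precision, recall, and F1 at a few thresholds. The labels, scores, and the helper name `precision_recall_f1` are made up for illustration and do not come from the notebook.

```python
def precision_recall_f1(y_true, scores, threshold):
    """Return (precision, recall, f1) when predicting 1 for scores >= threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels and predicted probabilities (assumption, not real model output).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]

table = {t: precision_recall_f1(y_true, scores, t) for t in (0.25, 0.50, 0.75)}
for t, (p, r, f) in table.items():
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")
```

On this toy data, lowering the threshold catches every positive at the cost of more False Positives, while raising it misses most positives. That is the trade-off a business decision has to settle.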

Among the machine learning models tested on this dataset, the Light Gradient Boosting framework, not surprisingly, produced the best results: Gini = 0.87, or equivalently, AUC = 0.93. The model's metrics are comparable to those of an analysis performed using the CRISP-DM methodology.
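The equivalence between the two reported numbers follows from the standard relation Gini = 2 × AUC − 1. A minimal sketch of the conversion is shown below; the function names are hypothetical.

```python
def gini_from_auc(auc):
    """Gini coefficient from ROC AUC: Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


def auc_from_gini(gini):
    """Inverse relation: AUC = (Gini + 1) / 2."""
    return (gini + 1.0) / 2.0


print(f"AUC for Gini 0.87: {auc_from_gini(0.87):.3f}")
print(f"Gini for AUC 0.93: {gini_from_auc(0.93):.2f}")
```

A Gini of 0.87 corresponds to an AUC of 0.935, which agrees with the reported 0.93 up to rounding.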
