This is a graduate course-level research project completed by Emily Au, Alex Mak, and Zheng En Than in MATH 509 (Data Structures and Platforms) at the University of Alberta. The project aims to predict whether bank clients will subscribe to a term deposit using tree-based machine-learning classifiers (Decision Tree, Random Forest, and XGBoost).
- Utilize tree-based machine-learning models to predict whether a client will subscribe to a term deposit through direct marketing campaigns.
- Identify the significant factors influencing a potential client's decision to subscribe to a term deposit.
- Determine the predictive accuracy of our classifier models in forecasting subscription outcomes.
- Observe the impact of bagging and boosting techniques on the predictive performance of tree-based machine-learning models.
- Entire codebase of the project (including data preprocessing, feature engineering, predictive modeling, model evaluation, and data visualization).
- The previous versions of the codebase are also stored.
- The dataset used in this project, both the raw and processed dataset.
- Bank Marketing dataset from UCI (UC Irvine) machine learning repository (https://archive.ics.uci.edu/dataset/222/bank+marketing).
- The fitted models and their corresponding parameters after training in this project.
- The finalized report of our project.
- The legacy version of the report is also stored.
- The visualizations generated with Python (matplotlib and seaborn) and Tableau.
- An engaging presentation conveying our findings and insights.
We have conducted the following steps in our project:
- Data Preprocessing (data cleaning and transformation, anomaly detection analysis, exploratory data analysis)
- Feature Engineering (feature importance, feature selection)
- Statistical Machine Learning Model Development (model training and fitting, model evaluation, model optimization, model prediction)
- Data Visualization (within and between models)
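The preprocessing and feature-importance steps above can be sketched as follows. This is a minimal illustration on a toy stand-in for the data, not the project's actual pipeline; the column names (`duration`, `poutcome`) follow the UCI Bank Marketing dataset, while the values are made up.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for the UCI Bank Marketing data (the real dataset has ~45k rows).
df = pd.DataFrame({
    "duration": [120, 300, 45, 600, 210, 90, 400, 30],   # last contact duration (seconds)
    "poutcome": ["success", "failure", "nonexistent", "success",
                 "failure", "nonexistent", "success", "failure"],
    "age": [34, 51, 23, 45, 38, 29, 60, 41],
    "y": [1, 1, 0, 1, 0, 0, 1, 0],                       # subscribed to a term deposit?
})

# One-hot encode the categorical feature, then split into train/test sets.
X = pd.get_dummies(df.drop(columns="y"), columns=["poutcome"])
y = df["y"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

clf = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)

# Impurity-based feature importances rank the most influential inputs.
importances = pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importances)
```

On the real data, the same `feature_importances_` attribute (and its Random Forest / XGBoost counterparts) is what surfaces features such as last contact duration and previous campaign outcome.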
- The most important features are: last contact duration, outcome of the previous marketing campaign, and day of year.
- Bagging and boosting each improve performance over the baseline Decision Tree for this specific problem and dataset.
- Numerical Results:
| Model | Training Accuracy | Testing Accuracy | Tuning Combinations | Computation Time |
|---|---|---|---|---|
| Decision Tree | 86.76% | 89.04% | 2592 | ~10 minutes |
| Random Forest | 91.49% | 90.22% | 1024 | ~20 minutes |
| XGBoost | 92.38% | 91.00% | 576 | ~40 minutes |
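The comparison behind the table can be sketched as below. This is a hedged illustration on synthetic data: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost to keep the sketch dependency-light, and the accuracies it prints are not the tuned numbers reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the bank dataset.
X, y = make_classification(n_samples=2000, n_features=15, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest (bagging)": RandomForestClassifier(n_estimators=200, random_state=0),
    # Stand-in for XGBoost: gradient boosting over shallow trees.
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=200, random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: train={model.score(X_train, y_train):.3f}, "
          f"test={model.score(X_test, y_test):.3f}")
```

In the project itself, each model was additionally tuned over the hyperparameter grids whose sizes appear in the "Tuning Combinations" column.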
- The optimized models implemented in this project are deployed in a Streamlit web application!
- Please clone this repo, go to Code --> Model_Deployment, and enter the following command:

  ```shell
  streamlit run Deployment_Codebase.py
  ```

The following screenshots show what the app looks like when it is deployed.
- Initialization
- Successful Prediction
- Failed Prediction
- Ensemble methods (Random Forest and XGBoost) can be more complex than a single Decision Tree, making it challenging to interpret the reasoning behind each prediction.
- Limited generalizability as the dataset consists of data from a Portuguese bank and its specific marketing approach.
- We would like to re-examine this project with a different dataset, for example one from another bank with a different telemarketing campaign.
- We are interested in further optimizing our tree-based machine learning models, but that also comes with the drawback of consuming additional computational resources.
- We are looking forward to implementing a gradient-boosted random forest (GBRF), which incorporates both bagging and boosting in one tree-based model. We could then analyze the impact of combining bagging and boosting, compared with using only one of them as in Random Forest and Decision Tree.
- We would conduct more in-depth analysis, such as exploring any temporal patterns or clustering the data based on client demographics to provide deeper insights into customer behavior, ultimately helping banks devise more effective targeted marketing strategies.
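The GBRF idea mentioned in the future work above can be prototyped by using a small random forest as the base learner inside a gradient-boosting loop. The sketch below is a hypothetical hand-rolled version (binary log-loss, forests fitted to pseudo-residuals), not the implementation we would ultimately use; libraries such as XGBoost offer built-in support for boosting forests.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary-classification data standing in for the bank dataset.
X, y = make_classification(n_samples=1500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Gradient boosting with a *random forest* (bagging) as each boosting stage:
# every round fits a small forest to the pseudo-residuals of the log-loss.
n_rounds, lr = 20, 0.3
F_tr = np.zeros(len(y_tr))   # raw scores (log-odds) on the training data
forests = []
for _ in range(n_rounds):
    residuals = y_tr - sigmoid(F_tr)   # negative gradient of binary log-loss
    rf = RandomForestRegressor(n_estimators=25, max_depth=3, random_state=0)
    rf.fit(X_tr, residuals)
    forests.append(rf)
    F_tr += lr * rf.predict(X_tr)

# Predict by summing the scaled forest outputs and thresholding at p = 0.5.
F_te = sum(lr * rf.predict(X_te) for rf in forests)
accuracy = ((sigmoid(F_te) > 0.5) == y_te).mean()
print(f"GBRF sketch test accuracy: {accuracy:.3f}")
```

Comparing this combined approach against pure bagging (Random Forest) and pure boosting (XGBoost) on the bank data would directly address the question raised above.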