- Python 2.7
- Numpy >= 1.14.2
- Matplotlib >= 2.2.0
- Pandas >= 0.22.0
- Scikit-Learn >= 0.19.1
The Bank Marketing dataset was collected from the direct marketing campaigns of a Portuguese banking institution.
The campaigns consisted of phone calls to clients to persuade them to make a term deposit with the bank.
After each call, the outcome was recorded as "no" (the client did not make a deposit) or "yes" (the client agreed to make a deposit).
The purpose of this project is to predict, from the client information, whether a client who is called will agree to make a term deposit.
The Bank Marketing Data Set used for this project is a small portion (10%) of the full data set. It has about 4119 rows, each with 19 features and 1 class column.
The main issues with the dataset are:
- Unknown values must be filled in during preprocessing
- Categorical features must be encoded during preprocessing so they can be used alongside continuous features
- The classes are imbalanced: class 1 (yes) has far fewer examples than class 0 (no)
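The three issues above can be illustrated with a minimal sketch. It is not the project's actual preprocessing code: the tiny data frame, its column names (`age`, `job`, `marital`, `y`), the use of the literal string `"unknown"` as the missing-value marker, and the choice of filling with the most frequent value are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has more columns and rows.
df = pd.DataFrame({
    "age": [30, 45, 52, 38, 41, 29],
    "job": ["admin.", "unknown", "technician", "admin.", "services", "admin."],
    "marital": ["married", "single", "unknown", "married", "married", "single"],
    "y": ["no", "no", "yes", "no", "no", "no"],
})

categorical = ["job", "marital"]  # assumed categorical columns

# 1. Fill unknown values with the most frequent known value of each column
#    (one possible strategy; others, such as a separate category, also work).
for col in categorical:
    known = df.loc[df[col] != "unknown", col]
    df[col] = df[col].replace("unknown", known.mode().iloc[0])

# 2. One-hot encode categorical columns so they sit alongside continuous ones.
X = pd.get_dummies(df.drop("y", axis=1), columns=categorical)

# 3. Encode the target as 0 (no) / 1 (yes) and inspect the class balance.
y = (df["y"] == "yes").astype(int)
class_share = y.value_counts(normalize=True)

print(X.columns.tolist())
print(class_share)
```

Checking `class_share` before training shows how skewed the classes are, which in turn informs the choice of evaluation metric and of any class weighting or resampling.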
The work done for this analysis includes:
- Understanding of features
- Preprocessing of features
- K-Nearest Neighbor Classifier
- Logistic Regression
- Naïve Bayes
- Random Forest Classifier
- Dimensionality Reduction
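The modelling steps listed above could be compared along the following lines. This is a hedged sketch rather than the project's code: it uses synthetic data from `make_classification` in place of the bank dataset, and the hyperparameters, the `class_weight="balanced"` setting, the F1 metric, and PCA as the dimensionality reduction method are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed data: 19 features, roughly 10% positives.
X, y = make_classification(n_samples=1000, n_features=19, n_informative=8,
                           weights=[0.9, 0.1], random_state=0)

# Stratify so both splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    # Distance-based and linear models get standardized inputs.
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(class_weight="balanced")),
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0),
    # Dimensionality reduction (PCA assumed here) ahead of a classifier.
    "pca_logistic_regression": make_pipeline(
        StandardScaler(), PCA(n_components=5),
        LogisticRegression(class_weight="balanced")),
}

# F1 on the minority class is more informative than accuracy on imbalanced data.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

for name in sorted(scores):
    print("%-25s F1 = %.3f" % (name, scores[name]))
```

Accuracy alone would be misleading here, since a model that always predicts "no" would already be right about 90% of the time on data this imbalanced.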