The loan approval process is a challenging task for any financial institution. Before giving credit loans to borrowers, the bank decides whether the borrower is bad (defaulter) or good (non-defaulter). This project focuses on developing Machine Learning (ML) models to predict loan eligibility, which is vital in accelerating the decision-making process and determining if an applicant gets a loan or not.
Dream Housing Finance deals in all kinds of home loans and has a presence across urban, semi-urban, and rural areas. A customer first applies for a home loan, after which the company validates the customer's eligibility for it.
The company wants to automate the loan eligibility process in real time, based on the details customers provide when filling in the online application form. These details include Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History, and others. To automate this process, the company has posed the problem of identifying the customer segments that are eligible for a loan, so that it can target those customers specifically.
- Analyze the customer data provided in the data set (EDA)
- Build various ML models that can predict loan approval
Task | Technique | Tools/Packages Used |
---|---|---|
Data Collection | Using dataset available in Kaggle | |
Data Cleaning | Drop unwanted columns, add new columns, deal with missing values | pandas |
Data Visualization | Multi-attribute plots | matplotlib, seaborn |
Data Preprocessing | Feature Encoding, Feature Engineering (deal with outliers and imbalanced data), Feature Scaling (normalization) | sklearn (LabelEncoder, MinMaxScaler), imblearn (SMOTE), pandas (get_dummies), numpy (log) |
Data Modeling | Supervised Machine Learning Models using Logistic Regression and Random Forest | sklearn |
Environments & Platforms | | Jupyter Notebook, Kaggle |
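
The cleaning and preprocessing rows of the table above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the sample rows, the column names (`Gender`, `Education`, `ApplicantIncome`, `LoanAmount`, `Loan_Status`), the imputation choices, and the helper `preprocess` are all assumptions made for the example. SMOTE is left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Tiny made-up sample; column names and values are assumptions for illustration.
raw = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "Education": ["Graduate", "Not Graduate", "Graduate", "Graduate"],
    "ApplicantIncome": [5000, 3000, 12000, 2500],
    "LoanAmount": [130.0, np.nan, 400.0, 90.0],
    "Loan_Status": ["Y", "N", "Y", "N"],
})


def preprocess(df):
    """Hypothetical helper: impute, log-transform, encode, and scale."""
    out = df.copy()
    numeric = ["ApplicantIncome", "LoanAmount"]

    # Missing values: most frequent value for categories, median for numbers.
    out["Gender"] = out["Gender"].fillna(out["Gender"].mode()[0])
    out["LoanAmount"] = out["LoanAmount"].fillna(out["LoanAmount"].median())

    # Log transform to dampen the effect of large outliers.
    for col in numeric:
        out[col] = np.log1p(out[col])

    # Encode the target as 0/1 (LabelEncoder sorts labels, so N -> 0, Y -> 1).
    out["Loan_Status"] = LabelEncoder().fit_transform(out["Loan_Status"])

    # One-hot encode the categorical features.
    out = pd.get_dummies(out, columns=["Gender", "Education"], dtype=int)

    # Scale numeric features into the [0, 1] range.
    out[numeric] = MinMaxScaler().fit_transform(out[numeric])
    return out


processed = preprocess(raw)
print(processed)
```

In a real workflow the scaler (and any resampling such as SMOTE) would normally be fitted on the training split only, so that information from the test split does not leak into the model.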
Below are some key insights that were generated as a result of exploratory data analysis (EDA).
- Applicants with a higher income tend to have a greater chance of loan approval.
- Graduates have a better chance of loan approval than non-graduates.
- Married applicants have an edge over unmarried applicants for loan approval.
- Applicants with fewer dependents have a higher probability of loan approval.
- The smaller the loan amount, the higher the chance of getting the loan.
- Applicants with a better credit history have a higher chance of loan approval.
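
Insights like the ones above usually come from comparing approval rates across groups. The sketch below shows one way to do that with pandas. The eight rows are toy data invented for the example, and the column names and the helper `approval_rate` are assumptions, so the printed rates say nothing about the real data set.

```python
import pandas as pd

# Toy data invented for illustration; not taken from the project's data set.
df = pd.DataFrame({
    "Credit_History": [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    "Education": ["Graduate", "Graduate", "Not Graduate", "Graduate",
                  "Not Graduate", "Graduate", "Not Graduate", "Not Graduate"],
    "Loan_Status": ["Y", "Y", "Y", "N", "N", "Y", "N", "N"],
})


def approval_rate(frame, by):
    """Hypothetical helper: share of approved loans within each group of `by`."""
    approved = frame["Loan_Status"].eq("Y")
    return approved.groupby(frame[by]).mean()


credit_rates = approval_rate(df, "Credit_History")
education_rates = approval_rate(df, "Education")
print(credit_rates)
print(education_rates)
```

The same helper can be applied to any categorical column, and a numeric column such as income can be binned first (for example with `pd.cut`) before grouping.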
Below are the machine learning models used for predicting whether a bank loan is approved or not.
Machine Learning Models | Accuracy | Precision | Recall | AUC Score |
---|---|---|---|---|
1. Logistic Regression | 0.83 | 0.81 | 0.98 | 0.72 |
2. Random Forest Classifier | 0.79 | 0.81 | 0.92 | 0.75 |
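
For reference, the four metrics in the table can be computed along the following lines. This is a sketch under stated assumptions: it trains on synthetic data from `make_classification` instead of the loan data, the hyperparameters and the helper `evaluate` are illustrative, and the AUC is computed from predicted probabilities, which may differ from how the scores above were obtained. It will not reproduce the numbers in the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the loan data (assumption): roughly 70% positive class.
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.3, 0.7], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest Classifier": RandomForestClassifier(n_estimators=100,
                                                       random_state=0),
}


def evaluate(model):
    """Hypothetical helper: fit on the training split, score on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "auc": roc_auc_score(y_test, proba),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 2) for metric, value in scores.items()})
```

Reporting precision and recall alongside accuracy matters here because the classes are imbalanced: a model that approves almost every applicant can reach a high recall while still misclassifying many defaulters.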