In financial lending, risk is everything. Every borrower represents a probability—will they repay the loan or default? Poorly predicted defaults can cause massive losses, destabilizing entire financial institutions. Accurate credit risk analysis isn’t just a statistical problem; it’s a survival strategy.
Unlike generic machine learning pipelines, this project was built with deep consideration of finance-specific metrics like Information Value (IV) and Weight of Evidence (WOE), ensuring the models aren’t just accurate but interpretable and actionable.
The dataset used in this project is an anonymized version of the American Express Default Prediction Dataset, comprising borrower-level information such as income, credit limits, previous defaults, and more. Here's a quick look:
- Train Dataset: 45,528 records
- Test Dataset: 11,383 records
- Target Variable: `credit_card_default` (binary classification: 1 for default, 0 for non-default)
Traditional credit risk models often rely on static statistical methods that fail to capture complex, non-linear relationships in data. We wanted to push beyond these limitations by building:
- A robust pipeline that handles data preprocessing, feature selection, scaling, and class imbalance effectively.
- Multiple machine learning models with a custom evaluation framework.
- A solution with high interpretability, making it practical for real-world financial institutions to adopt.
Our workflow is broken down into several key stages:
- Imputation of Missing Values (see the sketch after this list):
  - Categorical features were imputed using their mode.
  - Numerical features were imputed using the median to reduce the impact of outliers.
- Dropping Unnecessary Columns: Features like `customer_id` and `name` were dropped as they do not contribute to the prediction task.
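A minimal pandas sketch of these two steps, assuming the raw data is already loaded into a DataFrame (the column handling here is illustrative, not the project's exact code):

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Impute missing values and drop identifier columns."""
    df = df.copy()

    # Drop identifier columns that carry no predictive signal
    df = df.drop(columns=["customer_id", "name"], errors="ignore")

    # Mode imputation for categorical features
    for col in df.select_dtypes(include="object").columns:
        df[col] = df[col].fillna(df[col].mode()[0])

    # Median imputation for numerical features (robust to outliers)
    for col in df.select_dtypes(include="number").columns:
        df[col] = df[col].fillna(df[col].median())

    return df
```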
Feature engineering was a crucial step in this project, involving both statistical filtering and transformations tailored to financial data.
- We computed the Information Value (IV) for each feature to assess its predictive power.
- Features with IV < 0.02 were dropped, ensuring that only the most relevant features were retained.
IV quantifies the strength of a feature’s relationship with the target variable—higher IV means stronger predictive power.
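For reference, these are the standard definitions both quantities rest on: for each bin *i* of a feature,

$$
\mathrm{WOE}_i = \ln\!\left(\frac{\%\ \text{non-defaults}_i}{\%\ \text{defaults}_i}\right),
\qquad
\mathrm{IV} = \sum_i \left(\%\ \text{non-defaults}_i - \%\ \text{defaults}_i\right)\cdot \mathrm{WOE}_i
$$

so a feature whose distribution barely differs between defaulters and non-defaulters ends up with an IV close to 0.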
- After IV filtering, we applied WOE binning to all remaining features.
- WOE scales features in a way that ensures a monotonic relationship with the target, which is crucial for models like Logistic Regression.
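A simplified sketch of how WOE and IV can be computed for one binned feature (the binning call and column names below are illustrative, not the project's exact implementation):

```python
import numpy as np
import pandas as pd

def woe_iv(feature_bins: pd.Series, target: pd.Series, eps: float = 1e-6):
    """Return a per-bin WOE table and the feature's total IV.

    feature_bins: a binned/categorical feature (e.g. output of pd.qcut)
    target: binary series, 1 = default, 0 = non-default
    """
    df = pd.DataFrame({"bin": feature_bins, "target": target})
    grouped = df.groupby("bin")["target"].agg(["count", "sum"])
    grouped["defaults"] = grouped["sum"]
    grouped["non_defaults"] = grouped["count"] - grouped["sum"]

    # Distribution of defaults / non-defaults across bins (eps avoids log(0))
    pct_def = grouped["defaults"] / max(grouped["defaults"].sum(), 1) + eps
    pct_non = grouped["non_defaults"] / max(grouped["non_defaults"].sum(), 1) + eps

    grouped["woe"] = np.log(pct_non / pct_def)
    grouped["iv"] = (pct_non - pct_def) * grouped["woe"]
    return grouped[["woe", "iv"]], grouped["iv"].sum()

# Example: bin a numeric column into deciles, then compute its IV
# woe_table, iv = woe_iv(pd.qcut(train["credit_limit"], 10, duplicates="drop"),
#                        train["credit_card_default"])
```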
- Scaling: We applied Min-Max scaling to bring all features onto a common [0, 1] range, which matters for scale-sensitive models such as KNN and Logistic Regression (both steps are sketched below).
- Class Imbalance: Since the dataset had significantly fewer defaults than non-defaults, we used SMOTE (Synthetic Minority Over-sampling Technique) to balance the classes, so the models don't become biased toward predicting non-defaults.
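A short sketch of these two steps with scikit-learn and imbalanced-learn (variable names are illustrative; the scaler is fit on the training split only, and SMOTE is applied only to the training data to avoid leakage):

```python
from sklearn.preprocessing import MinMaxScaler
from imblearn.over_sampling import SMOTE

# Scale features to [0, 1]; fit on train only, then reuse the same scaler on test
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Oversample the minority (default) class in the training data only
smote = SMOTE(random_state=42)
X_train_bal, y_train_bal = smote.fit_resample(X_train_scaled, y_train)
```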
We trained the following models:
- Logistic Regression: A baseline model, valued for its simplicity and interpretability.
- Decision Tree: Offers inherent interpretability but prone to overfitting.
- Random Forest: An ensemble of decision trees that reduces overfitting.
- XGBoost: A gradient-boosting model known for its high performance on tabular data.
- CatBoost: Another gradient-boosting model, particularly effective for categorical data.
- LightGBM: A highly efficient gradient-boosting model.
- K-Nearest Neighbors (KNN): Included for comparison, with `k=5` chosen based on error analysis.
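A condensed sketch of how these models can be trained side by side on the balanced training data (the hyperparameters shown are library defaults, not the exact settings used in the project):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier
from catboost import CatBoostClassifier
from lightgbm import LGBMClassifier

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=42),
    "CatBoost": CatBoostClassifier(verbose=0, random_state=42),
    "LightGBM": LGBMClassifier(random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Fit every model on the SMOTE-balanced training set
fitted = {name: model.fit(X_train_bal, y_train_bal) for name, model in models.items()}
```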
We created a custom evaluation function to compute and display key metrics:
- Accuracy: Overall correctness of predictions.
- F1-Score: Balances precision and recall, crucial for imbalanced datasets.
- AUC-ROC: Measures a model’s ability to distinguish between defaulters and non-defaulters.
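A minimal sketch of what such an evaluation helper might look like, assuming the models expose `predict` and `predict_proba`:

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def evaluate(model, X_train, y_train, X_test, y_test):
    """Report accuracy, F1 and AUC-ROC on the train and test splits."""
    results = {}
    for split, (X, y) in {"train": (X_train, y_train), "test": (X_test, y_test)}.items():
        preds = model.predict(X)
        proba = model.predict_proba(X)[:, 1]
        results[f"{split}_accuracy"] = accuracy_score(y, preds)
        results[f"{split}_f1"] = f1_score(y, preds)
        results[f"{split}_auc_roc"] = roc_auc_score(y, proba)
    return results
```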
Here’s the final comparison of all models:
| Model | Train Accuracy | Test Accuracy | Train F1 Score | Test F1 Score | AUC-ROC |
|---|---|---|---|---|---|
| Decision Tree | 95.40% | 95.63% | 95.47% | 95.70% | 98.98% |
| CatBoost | 95.37% | 95.63% | 95.44% | 95.70% | 99.05% |
| Random Forest | 95.40% | 95.63% | 95.47% | 95.69% | 98.99% |
| LightGBM | 95.37% | 95.63% | 95.42% | 95.68% | 99.04% |
| XGBoost | 95.29% | 95.48% | 95.36% | 95.55% | 99.03% |
| KNN | 95.03% | 95.16% | 95.08% | 95.20% | 98.27% |
| Logistic Regression | 94.30% | 94.44% | 94.40% | 94.52% | 98.78% |
CatBoost emerged as the best-performing model, with the highest AUC-ROC (99.05%) and test accuracy and F1-score (95.63% and 95.70%) matching the best of the other models.
After selecting CatBoost as the best model, we trained it on the entire balanced train dataset and generated predictions on the test dataset. The predictions were saved in the file:
/reports/test_predictions_catboost.csv
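A hedged sketch of this final step (assuming `X_train_bal`, `y_train_bal`, and `X_test_scaled` from the earlier steps, plus a `test_ids` series held aside before `customer_id` was dropped; the output path mirrors the file above, written relative to the project root):

```python
import pandas as pd
from catboost import CatBoostClassifier

# Fit the selected model on the full balanced training set and score the test set
best_model = CatBoostClassifier(verbose=0, random_state=42)
best_model.fit(X_train_bal, y_train_bal)

test_preds = best_model.predict(X_test_scaled)
submission = pd.DataFrame({
    "customer_id": test_ids,                    # identifiers kept aside before dropping the column
    "credit_card_default": test_preds.ravel(),  # 1 = predicted default, 0 = non-default
})
submission.to_csv("reports/test_predictions_catboost.csv", index=False)
```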
- CatBoost outperformed all other models, making it the ideal choice for deployment in real-world scenarios.
- IV filtering and WOE binning significantly improved model interpretability, which is crucial for financial decision-making.
- SMOTE balanced the dataset effectively, ensuring that the models didn’t become biased toward predicting non-defaults.
This project reimagines credit risk analysis by integrating advanced machine learning techniques with carefully crafted, finance-specific feature engineering. We present a solution that doesn’t just predict credit defaults with high accuracy but does so in a way that’s both insightful and actionable for real-world financial decision-making.
- Hyperparameter Tuning: Fine-tune the hyperparameters of the best-performing models to squeeze out even better performance.
- Explainability Tools: Integrate tools like SHAP or LIME to provide detailed explanations of individual predictions (see the sketch after this list).
- Deployment: Deploy the final model as a Flask API or Streamlit app for real-time credit risk assessment.
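As a possible starting point for the explainability item, SHAP's tree explainer works directly with a fitted CatBoost model. This is only a sketch, not part of the current pipeline; `best_model` and `X_test_scaled` are assumed from the earlier sketches:

```python
import shap

# Explain individual predictions of the fitted CatBoost model
explainer = shap.TreeExplainer(best_model)
shap_values = explainer.shap_values(X_test_scaled)

# Visualize which features push a single applicant toward default or non-default
shap.force_plot(explainer.expected_value, shap_values[0], X_test_scaled[0], matplotlib=True)
```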
If you’re curious about the project or want to collaborate, feel free to connect: