This project uses Bayesian logistic regression to analyze loan default risk on the HMEQ dataset, which is widely used in credit risk evaluation. The main objective is to identify the predictors that significantly affect default risk and to assess model performance, stability, and accuracy across different preprocessing approaches.
- Tra Tran (s3694890)
- Thomas Bui (s3878174)
- Sharon Vincent (s402489)
To execute this project, follow the steps in the sequence below:
- Run the helper functions that set up the environment, load dependencies, and initialize the configuration required for the project.
- Preprocess the HMEQ dataset as needed before running the analysis.
- Typical preprocessing steps include handling missing values, scaling or normalizing features, and encoding categorical variables.
- The Bayesian logistic regression model is implemented using JAGS (Just Another Gibbs Sampler).
- Make sure JAGS and the required R or Python packages are installed.
- Monitor the convergence of the model using diagnostics such as trace plots, Gelman-Rubin statistics, and autocorrelation.
- Convergence diagnostics are critical to ensure reliable posterior estimates.
- Evaluate the model's predictive performance using metrics such as accuracy, AUC (Area Under the ROC Curve), and Brier score.
- These metrics indicate how well the model predicts loan defaults.
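
The evaluation metrics named in the last step can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's own evaluation code: the function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for the example. In practice `p_pred` would be the posterior mean default probability per loan.

```python
def accuracy(y_true, p_pred, threshold=0.5):
    """Share of cases where the thresholded probability matches the label.

    The 0.5 threshold is an assumption; a different cut-off may suit
    an imbalanced default rate better.
    """
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def roc_auc(y_true, p_pred):
    """AUC as the probability that a random default is scored above a
    random non-default (ties count one half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one default and one non-default")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = default, 0 = repaid.
    y_true = [0, 0, 1, 1, 0, 1]
    p_pred = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
    print("accuracy:", accuracy(y_true, p_pred))
    print("Brier score:", brier_score(y_true, p_pred))
    print("AUC:", roc_auc(y_true, p_pred))
```

The pairwise AUC is quadratic in the number of observations, which is fine for a sketch. For the full dataset a rank-based implementation from an established library would be the more usual choice.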
Convergence diagnostics for the model can be accessed here: RunJAGSOut.
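
As a reference for reading those diagnostics, the Gelman-Rubin statistic mentioned in the steps above can be sketched as follows. This is the classic potential scale reduction factor computed from several equal-length chains for one parameter. The function name and the toy chains are assumptions for illustration, and the value reported by the project's JAGS tooling may differ slightly because some implementations apply extra corrections.

```python
from math import sqrt
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic potential scale reduction factor (R-hat) for one parameter.

    `chains` is a list of equal-length lists of posterior draws,
    one inner list per MCMC chain.
    """
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")

    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: average within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    if within == 0:
        raise ValueError("within-chain variance is zero")

    # Pooled estimate of the posterior variance, then the ratio to W.
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)


if __name__ == "__main__":
    # Toy chains (hypothetical draws for a single coefficient).
    chains = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
    print("R-hat:", gelman_rubin(chains))
```

Values close to 1 suggest the chains have mixed. A common rule of thumb treats values above roughly 1.1 as a sign that more iterations or a longer burn-in are needed.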