R-codebase for the Modelling-standard titled "Diagnostics for Binary Classification" (Std-ClassifierDiagnostics), developed using retail mortgage data.
The scripts in this R-codebase are numbered and can be run sequentially in that order. Delinquency measures are algorithmically defined in DelinqM.R as data-driven functions, which may be valuable to the practitioner outside of this study's scope. These delinquency measures were formulated and empirically tested in Botha22 as part of a loss optimisation exercise of recovery decision times, as implemented in the corresponding R-codebase. A simulation study from Botha2021 also demonstrated these delinquency measures at length, with its own corresponding R-codebase. Similarly, the TruEnd-procedure from Botha2024 (which has its own corresponding R-codebase) is implemented in the TruEnd.R script, which includes a small set of functions for running the TruEnd-procedure in practice.
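As a minimal sketch of running the scripts in their numbered order, the R snippet below lists every script whose file name starts with a number and sources each one in ascending order. The helper names `list_numbered_scripts` and `run_in_order` are hypothetical and not part of this codebase, and the snippet assumes that all numbered scripts sit in a single directory and that unnumbered scripts (e.g., DelinqM.R and TruEnd.R) are loaded by the numbered scripts themselves where needed.

```r
# Hypothetical helper (not part of this codebase): list scripts whose names
# begin with a number, sorted by that numeric prefix and then by file name.
list_numbered_scripts <- function(dir = ".") {
  files <- list.files(dir, pattern = "^[0-9]+.*\\.R$", ignore.case = TRUE)
  # Extract the leading digits so that, e.g., "10_..." sorts after "2_..."
  prefix <- as.integer(sub("^([0-9]+).*$", "\\1", files))
  files[order(prefix, files)]
}

# Hypothetical helper: source each numbered script in order.
run_in_order <- function(dir = ".") {
  for (f in list_numbered_scripts(dir)) {
    message("Running ", f)
    source(file.path(dir, f), echo = FALSE)
  }
  invisible(NULL)
}
```

Calling `run_in_order()` from the directory holding the scripts would then execute them one after another. In practice, some scripts may be long-running or depend on data that must be prepared first, so running them one at a time and reading the commentary in each remains the safer route.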
This R-codebase assumes that monthly loan performance data is available. Naturally, the data itself cannot be made publicly available given its sensitive nature, as well as various data privacy laws, particularly the Protection of Personal Information (POPI) Act of 2013 in South Africa. However, the structure and type of data required for reproducing this study are sufficiently described in the commentary within the scripts. This should enable the practitioner to extract and prepare data accordingly. Moreover, this codebase assumes that South African macroeconomic data is available, as sourced and collated by internal staff of the bank in question.
All code and scripts are hereby released under an MIT license. Similarly, all graphs produced by the relevant scripts, as well as those published here, are hereby released under a Creative Commons Attribution (CC-BY 4.0) licence.