Bottom-up leading macroeconomic indicators: An application to non-financial corporate defaults using machine learning
This repository holds the replication files for experiments from:
Tyler Pike, Horacio Sapriza, and Tom Zimmermann. Bottom-up leading macroeconomic indicators: An application to non-financial corporate defaults using machine learning. Working paper, September 2019.
Abstract: This paper constructs a leading macroeconomic indicator from microeconomic data using recent machine learning techniques. Using tree-based methods, we estimate probabilities of default for publicly traded non-financial firms in the United States. We then use the cross-section of out-of-sample predicted default probabilities to construct a leading indicator of non-financial corporate health. The index predicts real economic outcomes such as GDP growth and employment up to eight quarters ahead. Impulse responses validate the interpretation of the index as a measure of financial stress.
Prepare data
- create firm-level data
  script: corp_default_data_creation
  output: corp_default_data_fisd.RDS
- calculate summary statistics and LaTeX tables
  script: corp_default_data_tables
  output: LaTeX table in terminal
- create summary charts of firm-level data
  script: corp_default_data_charts
  output: /Figures/observed_data.pdf
Create NFCH
- estimate firm-level defaults
  script: corp_default_estimation
  input: corp_default_data_fisd.RDS
  output: /Estimation/[predicted values], /Estimation/[tuning AUC charts][ROC curves][variable importance]
- calculate cross-sectional moments and aggregate firm-level probabilities into macro index
  script: corp_default_index_creation
  input: /Estimation/[predicted values]
  output: /Data/Output/indexes_aggregate.csv
- chart aggregate index
  script: corp_default_index_charts
  input: /Data/Output/indexes_aggregate.csv
  output: /Evaluation/Index/[charts]
Validate index
- compare in- and out-of-sample AUCs of machine learning methods
  script: corp_default_auc
  input: /Estimation/[predicted values]
  output: /Evaluation/AUC/auc_plot.pdf
- estimate impulse response functions
  script: corp_default_IRF
  input: /Data/Output/indexes_aggregate.csv
  output: /Evaluation/JordaIRF/[charts]
- perform forecasting exercises
  scripts: corp_default_forecast_insample, corp_default_forecast_outsample
  input: /Data/Output/indexes_aggregate.csv
  output: LaTeX tables in terminal
Set up
- The required output folders are included in this repo. If they need to be recreated, running the initialize.sh script will create the directories needed for the replication.
- Access to FRB data sources is required to run the replication in full, but publicly available data may be substituted as a proxy in several instances.
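As a rough illustration of what recreating the output folders involves, the sketch below creates the directories that appear in the output paths listed in this README. It is not the contents of initialize.sh: the directory list is inferred from those paths and may be incomplete, and the function name `make_output_dirs` is hypothetical.

```shell
#!/bin/sh
# Sketch only: recreate the output folders referenced in this README.
# Assumption: the directory list is inferred from the README's output
# paths and may not match what initialize.sh actually creates.

make_output_dirs() {
    # $1: project root (defaults to the current directory)
    root="${1:-.}"
    for dir in \
        Figures \
        Estimation \
        Data/Output \
        Evaluation/Index \
        Evaluation/AUC \
        Evaluation/JordaIRF
    do
        # -p creates missing parents and does nothing if the folder exists
        mkdir -p "$root/$dir" || return 1
    done
}
```

Calling `make_output_dirs .` from the project's root would leave existing folders and their contents untouched, since `mkdir -p` only creates what is missing.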
Run-time
- File paths assume a Unix environment and that the working directory is the project's root.
- The recursive index takes days to run if constructed from scratch; parallel processing is necessary for reasonable execution times.
- The user's WRDS credentials must be supplied in corp_default_data_creation.R and corp_default_index_creation.R.
- Files may be run one at a time, independently, or all together via replicate.sh.
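To make the run order concrete, the sketch below walks through the scripts in the order they are listed in this README. It is not the contents of replicate.sh. It assumes that each script is an .R file in the project's root and is run with `Rscript`; the function name `run_replication` and the `DRY_RUN` switch are hypothetical.

```shell
#!/bin/sh
# Sketch only: run the replication scripts in the order given in this README.
# Assumptions: every script is an .R file in the project's root and is run
# with Rscript; the real replicate.sh may be organised differently.

# Script names as listed in the README, in pipeline order.
REPLICATION_STEPS="
corp_default_data_creation
corp_default_data_tables
corp_default_data_charts
corp_default_estimation
corp_default_index_creation
corp_default_index_charts
corp_default_auc
corp_default_IRF
corp_default_forecast_insample
corp_default_forecast_outsample
"

run_replication() {
    # With DRY_RUN=1 (hypothetical switch) commands are printed, not executed.
    for step in $REPLICATION_STEPS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "Rscript $step.R"
        else
            echo "Running $step.R" >&2
            # Stop at the first failure so later steps do not use stale inputs
            Rscript "$step.R" || { echo "Step failed: $step" >&2; return 1; }
        fi
    done
}
```

Setting `DRY_RUN=1` before calling `run_replication` prints the ten commands without running anything, which is a cheap way to check the order before starting a run that can take days. The function should be called from the project's root, in line with the file-path assumption above.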