This project focuses on predicting the closure of small and medium-sized enterprises (SMEs) using Business Trends and Outlook Survey Data. Key aspects include:
- Data Utilization: Leveraged survey data to analyze and predict SME closures.
- Machine Learning Models: Implemented models using R, with packages such as `randomForest`, `catboost`, and `BART`.
- Performance Evaluation: Assessed models with metrics like AUROC, F1 score, and accuracy.
- Key Findings: Highlighted the importance of including non-financial data for accurate closure predictions.
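The evaluation metrics listed above can be illustrated with a minimal base R sketch. It is not the project's evaluation code: the toy vectors and the helper names `classification_metrics` and `auroc` are hypothetical, and the 0.5 threshold is an assumption chosen for illustration.

```r
# Toy labels and predicted closure probabilities (illustrative only, not project data)
actual <- c(1, 0, 1, 1, 0, 0, 1, 0)
prob   <- c(0.9, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.1)

# Hypothetical helper: accuracy and F1 score at a chosen probability threshold
classification_metrics <- function(actual, prob, threshold = 0.5) {
  pred <- as.integer(prob >= threshold)
  tp <- sum(pred == 1 & actual == 1)  # true positives
  fp <- sum(pred == 1 & actual == 0)  # false positives
  fn <- sum(pred == 0 & actual == 1)  # false negatives
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  c(accuracy = mean(pred == actual),
    f1       = 2 * precision * recall / (precision + recall))
}

# Hypothetical helper: AUROC via the rank-sum (Mann-Whitney) identity.
# rank() averages tied scores by default, which matches the usual AUROC convention.
auroc <- function(actual, prob) {
  r     <- rank(prob)
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

print(classification_metrics(actual, prob))
print(auroc(actual, prob))
```

Accuracy and F1 depend on the threshold used to turn probabilities into class labels, while AUROC is computed from the ranking of the scores alone, which is one reason to report them together.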
- `About the project in Korean.pdf`: Comprehensive project documentation in Korean, covering the project overview, data details, ML models used, performance results, and key findings. Includes detailed preprocessing information.
- `About the project.pdf`: Summary of the project in English.
- `Summary statistics.pdf`: Contains summary statistics for the variables used in the analysis.
- `Numble reflections.pdf`: Reflections on the project, written in Korean, detailing insights and lessons learned.
- `Numble Project.Rmd`: R Markdown file with complete project code, from data preprocessing to model evaluation.
- `Numble Project.R`: R script with all code for data preprocessing, model training, and evaluation.
Due to a contract with the competition organizer, the dataset used in this project cannot be uploaded. Although the dataset is not included, the provided code shows the project’s full methodology and analysis.
- Nayeon Kwon - Sourcing non-financial data, data preprocessing, support for ML model building, documentation
- Younghoon Yoo - Automated data preprocessing, building ML models, code optimization
This project is licensed under the MIT License. See the LICENSE.txt file for details.
Feel free to explore the repository and download the PDFs for detailed information about the Small and Medium-sized Enterprises Closure Prediction Project.