Abstract
This paper aims to enhance credit risk assessment for non-financial companies in Romania by developing a machine learning (ML) model that estimates the probability of default. The model draws on an extensive set of microeconomic data, including financial statements, loan-level data from the Credit Risk Register, shareholder structure, export and import activities, and external debt, to provide a comprehensive view of a company’s financial health and risk profile. The ML model employs logistic regression for classification, with 80% of the data used for training and 20% for validation. Performance was evaluated using the receiver operating characteristic (ROC) curve and the confusion matrix, and the model achieved an accuracy of 88%. Further validation through point-in-time estimation confirmed the model’s stability. The study is limited by the relatively low number of defaulting companies in the sample and by the unique economic disruptions of 2020 caused by the COVID-19 pandemic. To account for these factors, a Random Under-Sampling Boosted (RUSBoost) Trees approach is employed, which improves the model’s ability to distinguish between defaulted and non-defaulted debtors. Despite these limitations, the research concludes that integrating extensive financial data with advanced ML techniques has the potential to markedly enhance credit risk assessment, providing a reliable tool for financial institutions to manage credit risk effectively. Future improvements could address data imbalance and incorporate more diverse economic conditions to enhance predictive power for defaulting companies.
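The evaluation workflow summarized above (logistic regression, an 80/20 split, a confusion matrix, and the ROC curve) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the two features (`leverage`, `liquidity`) and all coefficients are hypothetical, and plain gradient descent from the standard library stands in for whatever software the study actually used.

```python
import math
import random

def make_synthetic(n=2000, seed=0):
    """Synthetic, imbalanced sample (assumption: not the paper's data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        leverage = rng.gauss(0.0, 1.0)   # hypothetical standardized ratio
        liquidity = rng.gauss(0.0, 1.0)  # hypothetical standardized ratio
        z = -2.8 + 1.5 * leverage - 1.0 * liquidity  # made-up coefficients
        p = 1.0 / (1.0 + math.exp(-z))
        rows.append(([leverage, liquidity], 1 if rng.random() < p else 0))
    return rows

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(data, lr=0.5, epochs=300):
    """Batch gradient descent on the mean log-loss."""
    d, n = len(data[0][0]), len(data)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, y in data:
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            gb += err
            for j in range(d):
                gw[j] += err * x[j]
        b -= lr * gb / n
        w = [wi - lr * gj / n for wi, gj in zip(w, gw)]
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

def confusion_matrix(y_true, y_prob, threshold=0.5):
    cm = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        key = ("t" if pred == y else "f") + ("p" if pred == 1 else "n")
        cm[key] += 1
    return cm

def roc_auc(y_true, y_prob):
    """Area under the ROC curve, computed as the share of
    (defaulter, non-defaulter) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

data = make_synthetic()
random.Random(1).shuffle(data)
cut = int(0.8 * len(data))               # 80% training, 20% validation
train, test = data[:cut], data[cut:]

model = fit_logistic(train)
y_test = [y for _, y in test]
p_test = [predict_proba(model, x) for x, _ in test]

cm = confusion_matrix(y_test, p_test)
accuracy = (cm["tp"] + cm["tn"]) / len(test)
auc = roc_auc(y_test, p_test)
print(cm, round(accuracy, 3), round(auc, 3))
```

On an imbalanced sample such as this one, accuracy can be high even when few defaulters are detected, so the `fn` count and the AUC are worth reading alongside it. This is consistent with the abstract's motivation for an under-sampling approach.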