Author:
Kyeong Sunghyon, Shin Jinho
Abstract
Commercial banks are required to explain credit evaluation results to their customers. Banks therefore try to improve the performance of their credit scoring models while keeping the results interpretable. However, there is a tradeoff between interpretability and model performance: logistic regression is interpretable, whereas machine learning-based models are black boxes. To address this tradeoff, we present a two-stage logistic regression method based on the Bayesian approach. In the first stage, we generate derivative variables by linearly combining the original features, weighted by their explanatory power as estimated through Bayesian inference. In the second stage, we develop a credit scoring model by fitting a logistic regression on these derivative variables. In this way, the explanatory power of a large number of original features can be used for default prediction, while the use of logistic regression preserves the model's interpretability. In the empirical analysis, an independent-samples t-test shows that the proposed approach significantly improves model performance compared with the conventional single-stage approach, i.e., the baseline model. The Kolmogorov–Smirnov statistic increases by 3.42 percentage points (%p), and the area under the receiver operating characteristic curve increases by 2.61%p. Because the two-stage modeling approach combines interpretability with improved credit scoring performance, the proposed method is essential for banking practitioners who must explain credit evaluation results and find ways to improve the performance of credit scoring models.
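The two performance measures reported in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `ks_statistic` and `auroc` and the toy scores and labels are assumptions made here for illustration only. The Kolmogorov–Smirnov statistic is taken as the largest gap between the cumulative score distributions of defaulters and non-defaulters, and the AUROC as the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter.

```python
from typing import Sequence


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Largest gap between the score CDFs of defaulters (1) and non-defaulters (0)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels))
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume tied scores together so the CDFs are compared
        # only at distinct score values.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): higher score = higher predicted default risk.
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
toy_labels = [0, 0, 1, 0, 1, 1]

print(f"KS    = {ks_statistic(toy_scores, toy_labels):.4f}")
print(f"AUROC = {auroc(toy_scores, toy_labels):.4f}")
```

Both measures lie between 0 and 1, so an improvement stated in percentage points is an absolute difference on that scale: a 3.42%p gain in the KS statistic corresponds to an increase of 0.0342.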
Publisher
Springer Science and Business Media LLC
Subject
Information Systems and Management, Computer Networks and Communications, Hardware and Architecture, Information Systems
Cited by
8 articles.