Abstract
Purpose
This paper compares nine models for evaluating consumer credit risk in Peer-to-Peer (P2P) lending: Logistic Regression (LR), Naive Bayes (NB), Linear Discriminant Analysis (LDA), k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), Classification and Regression Tree (CART), Artificial Neural Network (ANN), Random Forest (RF) and Gradient Boosting Decision Tree (GBDT).

Design/methodology/approach
The author uses data from the P2P platform Lending Club (LC) to assess the efficiency of these classification models across different economic scenarios, and compares the resulting rankings of the credit risk models using three families of evaluation metrics.

Findings
The risk classification models perform better in the 2013–2019 economic period than in the difficult 2007–2012 period. In ranking the models for predicting default risk, GBDT is the best model on most of the metrics and metric families included in the study. The findings also support the results of Tsai et al. (2014) and Teplý and Polena (2019): LR, ANN and LDA classify loan applications quite stably and accurately, while CART, k-NN and NB perform worst when predicting borrower default risk on P2P loan data.

Originality/value
The main contribution to the empirical literature is a comparison of nine models, covering both statistical and machine learning algorithms, for predicting the risk of consumer loan applications. The models are evaluated with performance measures from three separate families of metrics (threshold, ranking and probabilistic), chosen to suit the characteristics of the LC lending platform data, over two periods that reflect the economic situation and the development of the platform.
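The three metric families named above can be illustrated with one representative measure each. The following is a minimal sketch, not the author's evaluation code: the choice of accuracy (threshold), AUC (ranking) and Brier score (probabilistic) is an assumption, since the abstract does not list the specific metrics, and the function names and toy data are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], p_hat: Sequence[float], threshold: float = 0.5) -> float:
    """Threshold metric: share of correct labels after cutting scores at `threshold`."""
    preds = [1 if p >= threshold else 0 for p in p_hat]
    return sum(int(y == d) for y, d in zip(y_true, preds)) / len(y_true)


def auc(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Ranking metric: probability that a random defaulter is scored above a
    random non-defaulter (ties count one half)."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Probabilistic metric: mean squared error of the predicted probabilities."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


# Hypothetical toy data: 1 = default, 0 = repaid; p_hat = predicted default probability.
y_true = [1, 0, 1, 0, 0, 1]
p_hat = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3]

print(accuracy(y_true, p_hat), auc(y_true, p_hat), brier(y_true, p_hat))
```

The families answer different questions: accuracy depends on the chosen cutoff, AUC only on how the scores order the borrowers, and the Brier score on how close the predicted probabilities are to the outcomes. A model can therefore rank differently across the three, which is why a comparison may report them separately.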
References (30 articles)
1. Predicting online peer-to-peer (P2P) lending default using data mining techniques, 2018
2. An experimental comparison of classification algorithms for imbalanced credit scoring data sets; Expert Systems with Applications, 2012
3. Predicting default risk of lending club, 2015
4. Dinh, T.H.T., Kleimeier, S. and Straetmans, S.T.M. (2013), “Bank lending strategy, credit scoring and financial crises”, in Research Memoranda 053, Maastricht University, Graduate School of Business and Economics (GSBE), doi: 10.26481/umagsb.2013053.