Affiliation:
1. School of Information Management and Artificial Intelligence, Zhejiang University of Finance and Economics, Hangzhou, China
Abstract
Advances in machine learning have made more effective credit scoring possible. Ensemble learning, a widely recognized machine learning approach, has delivered significantly higher predictive accuracy than individual machine learning models in credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to handle imbalanced data. Then, a new selective sampling mechanism adaptively selects the better-performing base classifiers. Finally, a new feature-enhanced stacking method combines the shortlisted base classifiers into an effective ensemble model. The proposed model is evaluated on four datasets using four evaluation indicators, and the experimental results show that it outperforms the benchmark models.
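The abstract does not specify how the multiple K-means-based undersampling works, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that the majority class is clustered with K-means for several values of k and that each clustering yields one roughly balanced training subset, with majority samples drawn from each cluster in proportion to its size. All names (`kmeans`, `kmeans_undersample`, `balanced_subsets`, `ks`) are hypothetical, and the selective sampling and feature-enhanced stacking stages are not shown.

```python
import math
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-means; returns a cluster index for each point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def kmeans_undersample(majority, n_keep, k, rng):
    """Keep about n_keep majority points, sampled per cluster by size.

    Assumption: proportional allocation; rounding means the total can
    differ from n_keep by a few points.
    """
    labels = kmeans(majority, k, rng)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        quota = round(n_keep * len(members) / len(majority))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept


def balanced_subsets(majority, minority, ks, seed=0):
    """Build one roughly balanced (X, y) subset per value of k."""
    rng = random.Random(seed)
    subsets = []
    for k in ks:
        kept = kmeans_undersample(majority, len(minority), k, rng)
        X = kept + minority
        y = [0] * len(kept) + [1] * len(minority)  # 1 = minority class
        subsets.append((X, y))
    return subsets


# Synthetic data for illustration only: 200 majority vs. 20 minority points.
demo_rng = random.Random(42)
demo_majority = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(200)]
demo_minority = [(demo_rng.gauss(3, 1), demo_rng.gauss(3, 1)) for _ in range(20)]
demo_subsets = balanced_subsets(demo_majority, demo_minority, ks=[2, 4, 8])
```

In a setup like this, each subset could train a separate base classifier, which is where a selection step and a stacking step such as those named in the abstract would plausibly follow.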
Subject
Artificial Intelligence, General Engineering, Statistics and Probability
Cited by
9 articles.