A novel multi-stage ensemble model with multiple K-means-based selective undersampling: An application in credit scoring

Published:2021-04-22
Issue:5
Volume:40
Page:9471-9484
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Jin Yilun¹, Liu Yanan¹, Zhang Wenyu¹, Zhang Shuai¹, Lou Yu¹

Affiliation:

1. School of Information Management and Artificial Intelligence, Zhejiang University of Finance and Economics, Hangzhou, China

Abstract

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.
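
The abstract names K-means-based undersampling of the majority class but gives no algorithmic detail, so the following is only a minimal sketch of generic cluster-based undersampling, not the authors' method. It assumes that "multiple" can be approximated by repeating K-means with different seeds, and that one representative (the point nearest each centroid) is kept per cluster. All function names (`kmeans`, `kmeans_undersample`, `balanced_subsets`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster label per point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(points, centroids)


def kmeans_undersample(majority, n_keep, seed=0):
    """Keep at most n_keep majority points: the one nearest each centroid."""
    k = min(n_keep, len(majority))
    centroids, labels = kmeans(majority, k, seed=seed)
    kept = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        if members:
            kept.append(min(members, key=lambda p: sq_dist(p, centroids[c])))
    return kept


def balanced_subsets(majority, minority, seeds=(0, 1, 2)):
    """Assumption: one roughly balanced subset per seed, for later ensembling."""
    return [(kmeans_undersample(majority, len(minority), seed=s), list(minority))
            for s in seeds]
```

Each returned pair holds a reduced majority set and the full minority set, so a separate base classifier could be trained on each pair. How the paper selects base classifiers and builds the feature-enhanced stacking layer is not described in the abstract and is not sketched here.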

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

Reference 43 articles.

1. On voting-based consensus of cluster ensembles;Ayad;Pattern Recognition,2010

2. Bagging predictors;Breiman;Machine Learning,1996

3. Random forests;Breiman;Machine Learning,2001

4. Breiman L., Friedman J., Stone C.J. and Olshen R.A., Classification and Regression Trees, CRC Press (1984).

5. Brodersen K.H., Ong C.S., Stephan K.E. and Buhmann J.M., The balanced accuracy and its posterior distribution, In Proceedings of the 20th International Conference on Pattern Recognition, Istanbul, Turkey (2010), 3121–3124.

Cited by 9 articles.

1. A new hybrid credit scoring ensemble model with feature enhancement and soft voting weight optimization;Expert Systems with Applications;2024-03

2. A novel ensemble model of multi-class credit assessment based on multi-source fusion theory;Journal of Intelligent & Fuzzy Systems;2024-01-10

3. A Two-stage Clustering Undersampling for Class-overlapped Imbalanced Classification;2023 IEEE 9th International Conference on Cloud Computing and Intelligent Systems (CCIS);2023-08-12

4. Determining susceptible body parts of construction workers due to occupational injuries using inclusive modelling;Safety Science;2023-08

5. The personal credit default discrimination model based on DF21;Journal of Intelligent & Fuzzy Systems;2023-03-09