Empirical Analysis of Ensemble Learning for Imbalanced Credit Scoring Datasets: A Systematic Review

Published: 2022-06-15 Issue: Volume: 2022 Page: 1-18
ISSN: 1530-8677
Container-title: Wireless Communications and Mobile Computing
Language: en
Short-container-title: Wireless Communications and Mobile Computing

Author:

Lenka Sudhansu R.¹, Bisoy Sukant Kishoro¹, Priyadarshini Rojalina¹, Sain Mangal²

Affiliation:

1. Department of CSE, C.V. Raman Global University, Bhubaneswar, India

2. Division of Computer Engineering, Dongseo University, Busan 47011, Republic of Korea

Abstract

Credit scoring analysis has gained tremendous importance for researchers and the financial industry around the globe. It helps financial institutions grant credit or loans to deserving applicants with zero or minimal risk. However, developing an accurate and effective credit scoring model is challenging because of class imbalance and the presence of irrelevant features. Recent research shows that ensemble learning has become the leading approach in this field. In this paper, we perform an extensive comparative analysis of ensemble algorithms. To improve the algorithms further, oversampling and feature selection (FS) techniques are applied. Relevant features are identified using three FS techniques: information gain (IG), principal component analysis (PCA), and genetic algorithm (GA). Additionally, a comparative performance analysis is carried out using 5 base and 14 ensemble models on three credit scoring datasets. The experimental results show that the GA-based FS technique and the CatBoost algorithm perform significantly better than the other models in terms of five metrics: accuracy (ACC), area under the curve (AUC), F1-score, Brier score (BS), and the Kolmogorov-Smirnov (KS) statistic.
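
The five evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and scores, the 0.5 decision threshold, and the choice of label 1 as the positive class are all assumptions made here for illustration. It uses only the standard library and the textbook definitions of each metric.

```python
# Illustrative sketch only; names, data, and threshold are assumptions,
# not taken from the paper.

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    # Harmonic mean of precision and recall for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def brier_score(y_true, y_prob):
    # Mean squared difference between predicted probability and outcome;
    # lower is better.
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def split_scores(y_true, y_prob):
    # Separate the scores of positive and negative examples.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    return pos, neg

def auc(y_true, y_prob):
    # Probability that a random positive scores higher than a random
    # negative, counting ties as one half (pairwise form of ROC AUC).
    pos, neg = split_scores(y_true, y_prob)
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))

def ks_statistic(y_true, y_prob):
    # Largest gap between the empirical score distributions (CDFs)
    # of the two classes, evaluated at every observed score.
    pos, neg = split_scores(y_true, y_prob)
    best = 0.0
    for thr in sorted(set(y_prob)):
        cdf_pos = sum(p <= thr for p in pos) / len(pos)
        cdf_neg = sum(p <= thr for p in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Toy, imbalanced example: four negatives and two positives (hypothetical).
y_true = [0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

print("ACC:", accuracy(y_true, y_pred))
print("AUC:", auc(y_true, y_prob))
print("F1 :", f1_score(y_true, y_pred))
print("BS :", brier_score(y_true, y_prob))
print("KS :", ks_statistic(y_true, y_prob))
```

ACC and F1 depend on the chosen decision threshold, whereas AUC, BS, and KS are computed from the predicted scores directly, which is one reason imbalanced-data studies tend to report several metrics side by side.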

Funder

Dongseo University

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems

Link

http://downloads.hindawi.com/journals/wcmc/2022/6584352.pdf
