Affiliation:
1. Bucharest University of Economic Studies, Bucharest, Romania
Abstract
Machine Learning is a constantly growing area with the capacity to analyze massive amounts of data and find relevant patterns, a very important feature in the era of big data. It has a wide range of application areas, including the financial field, and has proved efficient in solving various problems, among them using classification algorithms to predict the probability that a customer will default on their obligations to the bank. The output of these algorithms is then used when deciding whether or not to approve a loan, based on the previous behavior of customers, thereby reducing the bank's losses. Even though Machine Learning algorithms have proved efficient in solving this type of problem, none has stood out with remarkable results. This paper studies 10 different methods applied to the same dataset (Logistic Regression, K-Nearest Neighbor, Support Vector Machine, Kernel Support Vector Machine, Naïve Bayes, Decision Tree, Random Forest, Bagging Classifier, Linear Discriminant Analysis, Neural Network - Multi-Layer Perceptron) and performs a comparative analysis aiming to identify the one that outperforms the others. Their performance is evaluated using well-known statistical measures: Accuracy, Misclassification Rate, Precision and Specificity. In addition, this paper presents and evaluates the impact of feature selection on the overall performance of an algorithm.
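The four evaluation measures named in the abstract can all be derived from the confusion matrix of a binary classifier. The sketch below is illustrative only and is not taken from the paper: the function names, the convention that label 1 marks a defaulting customer, and the toy labels are all assumptions made for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem.

    Assumption: `positive` (1 by default) marks the "default" class.
    """
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(y_true, y_pred, positive=1):
    """Return Accuracy, Misclassification Rate, Precision and Specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total,
        # Share of all predictions that are wrong (1 - accuracy).
        "misclassification_rate": (fp + fn) / total,
        # Share of predicted defaulters that really defaulted.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of non-defaulters correctly recognised as such.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical toy labels, not data from the paper.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Running the same `evaluate` function on the predictions of each candidate classifier, with the test labels held fixed, would give a like-for-like basis for the kind of comparison the abstract describes.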
Subject
General Earth and Planetary Sciences, General Environmental Science