Affiliations
1. Department of Applied Informatics, Vytautas Magnus University, Kaunas, Lithuania
2. Silesian University of Technology, Gliwice, Poland
Abstract
Financial distress identification remains an essential topic in the scientific literature because of its importance for society and the economy. Advances in information technology and the growing volume of stored data have given rise to financial distress research that goes beyond financial statements and their indicators (ratios). The feature space can be expanded with additional categories of data, such as macroeconomic, sector, social, board, management, and judicial incident information. However, the increased dimensionality results in sparse data and overfitted models. This study proposes a new approach to efficient financial distress classification that combines dimensionality reduction and machine learning techniques. The proposed framework aims to identify a subset of features that minimizes the loss function describing financial distress in an enterprise. In the study, 15 dimensionality reduction techniques with different numbers of features and 17 machine learning models were compared. Overall, 1,432 experiments were performed using Lithuanian enterprise data covering the period from 2015 to 2022. The results show that the artificial neural network (ANN) model with 30 ranked features identified using the Random Forest mean decreasing Gini (RF_MDG) feature selection technique achieved the highest AUC score. Moreover, this study introduces a novel approach to feature extraction, which could improve financial distress classification models.
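As an illustration only, the best-performing configuration described above (Random Forest Gini-based feature ranking, the top 30 features, an ANN classifier, AUC as the metric) might be sketched as follows. This is not the authors' code: it assumes scikit-learn, uses synthetic data in place of the Lithuanian enterprise dataset, and the network size, sample counts, and class balance are hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a wide, imbalanced distress dataset (assumption).
X, y = make_classification(
    n_samples=1000, n_features=100, n_informative=15,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Rank features by Random Forest impurity-based importance
# (mean decrease in Gini impurity with the default criterion).
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
top_k = 30
selected = np.argsort(rf.feature_importances_)[::-1][:top_k]

# Train a small neural network on the selected features only.
# The hidden layer size is a hypothetical choice, not taken from the study.
ann = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
ann.fit(X_train[:, selected], y_train)

# Evaluate with AUC on the held-out split.
auc = roc_auc_score(y_test, ann.predict_proba(X_test[:, selected])[:, 1])
print(f"AUC with top {top_k} features: {auc:.3f}")
```

The ranking is fitted on the training split only, so the held-out AUC is not influenced by the feature selection step.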
Funder
Horizon Europe Widening Participation program - Teaming for Excellence 2022 (Centre of Excellence of AI for Sustainable Living and Working (SustAInLivWork) project) and from the European Union