Abstract
AbstractTextual-based factors have been widely regarded as a promising feature that can be applied to financial issues. This study focuses on extracting both basic and semantic textual features to supplement the traditionally used financial indicators. The main is to improve Chinese listed companies’ financial distress prediction (FDP). A unique paradigm is proposed in this study that combines financial and multi-type textual predictive factors, feature selection methods, classifiers, and time spans to achieve the optimal FDP. The frequency counts, TF-IDF, TextRank, and word embedding approaches are employed to extract frequency count-based, keyword-based, sentiment, and readability indicators. The experimental results prove that financial domain sentiment lexicons, word embedding-based readability analysis approaches, and the basic textual features of Management Discussion and Analysis can be important elements of FDP. Moreover, the finding highlights the fact that incorporating financial and textual features can achieve optimal performance 4 or 5 years before the expected baseline year; applying the RF-GBDT combined model can also outperform other classifiers. This study makes an innovative contribution, since it expands the multiple text analysis method in the financial text mining field and provides new findings on how to provide early warning signs related to financial risk. The approaches developed in this research can serve as a template that can be used to resolve other financial issues.
Funder
National Natural Science Foundation of China
Opening Foundation of Early Warning and Emergency Linkage Technology Hubei Provincial Collaborative Innovation Center
Publisher
Springer Science and Business Media LLC
Subject
Computational Mathematics,General Computer Science
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献