Author:
Zhi Yueping, Tao Xiangxing, Ji Yanting
Abstract
Chinese financial securities named entity recognition aims to extract entities that affect security prices from unstructured Chinese text such as news, announcements, and research reports. Recognizing entities in this field is challenging because of the abundance of specialized terms, the diversity of expressions, and the limited feature extraction capabilities of traditional models. To address this, we propose MFF-CNER, a multi-feature fusion model for Chinese financial securities named entity recognition. MFF-CNER consists of several key steps. First, it uses a pre-trained BERT model to capture semantic features at the character level. Second, a BiLSTM network captures contextual features specific to financial securities text. Third, an Iterated Dilated Convolutional Neural Network (IDCNN) blends and extracts local features, and an Attention mechanism weights the features as they are integrated. Finally, the predicted sequences are optimized and decoded using a Conditional Random Field (CRF). To validate the state-of-the-art performance of MFF-CNER in this domain, we compare it with five popular methods on a Chinese financial securities dataset annotated with the BIO labeling scheme. MFF-CNER demonstrates superior performance while maintaining compatibility among its components. Furthermore, we evaluate the applicability of MFF-CNER using public datasets from other domains, including social media (WEIBO) and news (MSRA). This research has practical significance for downstream applications such as constructing financial securities knowledge graphs and analyzing the factors that influence security prices.
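The abstract mentions the BIO labeling scheme, in which each character is tagged as the beginning of an entity (B-), inside an entity (I-), or outside any entity (O). The following is a minimal sketch of how such a tag sequence might be turned back into entity spans. It is not taken from the paper: the function name `bio_to_spans`, the `ORG` label, and the sample sentence are hypothetical and serve only as illustration.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert a BIO tag sequence into (entity_type, start, end_exclusive) spans."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        # "I-X" continues the open entity only when its type matches.
        if tag.startswith("I-") and current == tag[2:]:
            continue
        # Any other tag closes the open entity, if there is one.
        if current is not None:
            spans.append((current, start, i))
            start, current = None, None
        if tag.startswith("B-"):
            start, current = i, tag[2:]
        # "O" tags and stray "I-X" tags with no matching "B-X" are treated as outside.
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


# Hypothetical character-level example; the sentence and the ORG label are
# illustrative assumptions, not samples from the paper's dataset.
chars = list("平安银行发布公告")
tags = ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "O", "O", "O", "O"]
entities = [("".join(chars[s:e]), t) for t, s, e in bio_to_spans(tags)]
```

In this sketch a stray I- tag without a preceding B- tag of the same type is ignored. Other conventions exist, such as treating it as the start of a new entity, and the source does not say which one the authors follow.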
Publisher
Darcy & Roy Press Co. Ltd.