Authors:
Xu Yangshuyi, Liu Guangzhong, Zhang Lin, Shen Xiang, Luo Sizhe
Abstract
Chinese long text classification plays a vital role in Natural Language Processing. Compared with Chinese short texts, Chinese long texts contain more complex semantic feature information, and these semantic features are unevenly distributed because text lengths vary. Current research on Chinese long text classification models focuses primarily on enhancing text semantic features and on representing Chinese long texts as graph-structured data. Nonetheless, these methods remain susceptible to noise and tend to overlook the deep semantic information in long texts. To address these challenges, this study proposes a novel and effective method called MACFM, which introduces a deep feature information mining method and an adaptive modal feature information fusion strategy to thoroughly learn the semantic features of Chinese long texts. First, we present the DCAM module to capture complex semantic features in Chinese long texts, allowing the model to learn detailed high-level representation features. Then, we explore the relationships between word vectors and text graphs, enabling the model to capture abundant semantic information and text positional information from the graph. Finally, we develop the AMFM module to effectively combine different modal feature representations and eliminate unrelated noise. Experimental results on five Chinese long text datasets show that our method significantly improves the accuracy of Chinese long text classification. Furthermore, generalization experiments on five English datasets and the visualized results demonstrate the effectiveness and interpretability of the MACFM model.
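The abstract does not give the internals of the AMFM module. Purely as an illustration of the kind of adaptive modal fusion it describes, the sketch below uses a standard gated-fusion pattern (a gated multimodal unit) to combine a sequence-level embedding with a graph-level embedding, letting a learned gate down-weight noisy features per dimension. All class, parameter, and dimension names here are hypothetical and are not the authors' implementation.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Minimal gated fusion of two modality embeddings (hypothetical
    stand-in for an adaptive fusion module such as AMFM)."""

    def __init__(self, dim_seq: int, dim_graph: int, dim_out: int):
        super().__init__()
        self.proj_seq = nn.Linear(dim_seq, dim_out)            # sequence-branch projection
        self.proj_graph = nn.Linear(dim_graph, dim_out)        # graph-branch projection
        self.gate = nn.Linear(dim_seq + dim_graph, dim_out)    # adaptive per-dimension gate

    def forward(self, h_seq: torch.Tensor, h_graph: torch.Tensor) -> torch.Tensor:
        s = torch.tanh(self.proj_seq(h_seq))
        g = torch.tanh(self.proj_graph(h_graph))
        # Gate computed from both modalities; values near 0 or 1 suppress
        # the less informative (noisier) branch for each feature dimension.
        z = torch.sigmoid(self.gate(torch.cat([h_seq, h_graph], dim=-1)))
        return z * s + (1 - z) * g

# Usage: fuse a 768-d sequence vector with a 256-d graph vector (dims are illustrative).
fusion = GatedFusion(dim_seq=768, dim_graph=256, dim_out=512)
fused = fusion(torch.randn(4, 768), torch.randn(4, 256))
print(fused.shape)  # torch.Size([4, 512])
```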
Publisher
Springer Science and Business Media LLC