Abstract
Traditional work on aspect-based sentiment analysis considers only the text modality. In social media scenarios, however, texts often contain abbreviations, misspellings, or grammatical errors, which undermine text-only methods. This study proposes an end-to-end cross-modal hierarchical interactive fusion network to address this challenge. The network introduces a feature attention module and a feature fusion module to capture multimodal interaction features between the image modality and the text modality. Through an attention mechanism and a gated fusion mechanism, these two modules allow images to play an auxiliary role in the text-based aspect-based sentiment analysis task. In addition, a boundary auxiliary module exploits the dependencies between the two core subtasks of aspect-based sentiment analysis, aspect term extraction and sentiment classification. Experimental results on two publicly available multimodal aspect-based sentiment datasets validate the effectiveness of the proposed approach.
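As a concrete illustration of the attention-plus-gated-fusion idea described above, the following is a minimal sketch in PyTorch. It is not the authors' architecture: the class name GatedFusion, the layer choices, and all dimensions are assumptions introduced here purely for illustration.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Hypothetical sketch (not the paper's exact model): image features
    are aligned to text tokens via cross-modal attention, then merged
    through a learned gate that controls how much visual information
    flows into each token representation."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        # Cross-modal attention: text tokens attend over image regions.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Gate computed from the concatenated text and attended-image features.
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, text: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # text:  (batch, seq_len, dim)  token representations
        # image: (batch, regions, dim)  visual region features
        attended, _ = self.cross_attn(query=text, key=image, value=image)
        g = torch.sigmoid(self.gate(torch.cat([text, attended], dim=-1)))
        # The gate interpolates between the image-conditioned feature and
        # the original text feature, so the text dominates when the image
        # carries little aspect-relevant information.
        return g * attended + (1 - g) * text
```

Under these assumptions, a call such as GatedFusion(768)(text_feats, image_feats) would return text features enriched with gated visual context, realizing the auxiliary role of the image modality described in the abstract.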