Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks-Reference-Cited by-同舟云学术

Multimodal Sentiment Analysis Based on Cross-Modal Attention and Gated Cyclic Hierarchical Fusion Networks

Published:2022-08-09 Issue: Volume:2022 Page:1-12
ISSN:1687-5273
Container-title:Computational Intelligence and Neuroscience
language:en
Short-container-title:Computational Intelligence and Neuroscience

Author:

Quan Zhibang¹^ORCID,Sun Tao¹^ORCID,Su Mengli¹,Wei Jishu¹

Affiliation:

1. School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China

Abstract

Multimodal sentiment analysis has been an active subfield in natural language processing. This makes multimodal sentiment tasks challenging due to the use of different sources for predicting a speaker’s sentiment. Previous research has focused on extracting single contextual information within a modality and trying different modality fusion stages to improve prediction accuracy. However, a factor that may lead to poor model performance is that this does not consider the variability between modalities. Furthermore, existing fusion methods tend to extract the representational information of individual modalities before fusion. This ignores the critical role of intermodal interaction information for model prediction. This paper proposes a multimodal sentiment analysis method based on cross-modal attention and gated cyclic hierarchical fusion network MGHF. MGHF is based on the idea of distribution matching, which enables modalities to obtain representational information with a synergistic effect on the overall sentiment orientation in the temporal interaction phase. After that, we designed a gated cyclic hierarchical fusion network that takes text-based acoustic representation, text-based visual representation, and text representation as inputs and eliminates redundant information through a gating mechanism to achieve effective multimodal representation interaction fusion. Our extensive experiments on two publicly available and popular multimodal datasets show that MGHF has significant advantages over previous complex and robust baselines.

Funder

National Basic Research Program of China

Publisher

Hindawi Limited

Subject

General Mathematics,General Medicine,General Neuroscience,General Computer Science

Link

http://downloads.hindawi.com/journals/cin/2022/4767437.pdf

Reference37 articles.

1. Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion

2. CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality

3. US presidential election 2020 prediction based on Twitter data using lexicon-based sentiment analysis