Tailor Versatile Multi-Modal Learning for Multi-Label Emotion Recognition

Published:2022-06-28 Issue:8 Volume:36 Page:9100-9108
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Zhang Yi,Chen Mingyuan,Shen Jundong,Wang Chongjun

Abstract

Multi-modal Multi-label Emotion Recognition (MMER) aims to identify various human emotions from heterogeneous visual, audio and text modalities. Previous methods mainly focus on projecting multiple modalities into a common latent space and learning an identical representation for all labels, which neglects the diversity of each modality and fails to capture richer semantic information for each label from different perspectives. Besides, associated relationships of modalities and labels have not been fully exploited. In this paper, we propose versaTile multi-modAl learning for multI-labeL emOtion Recognition (TAILOR), aiming to refine multi-modal representations and enhance discriminative capacity of each label. Specifically, we design an adversarial multi-modal refinement module to sufficiently explore the commonality among different modalities and strengthen the diversity of each modality. To further exploit label-modal dependence, we devise a BERT-like cross-modal encoder to gradually fuse private and common modality representations in a granularity descent way, as well as a label-guided decoder to adaptively generate a tailored representation for each label with the guidance of label semantics. In addition, we conduct experiments on the benchmark MMER dataset CMU-MOSEI in both aligned and unaligned settings, which demonstrate the superiority of TAILOR over state-of-the-art methods.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 20 articles.

1. Triple disentangled representation learning for multimodal affective analysis;Information Fusion;2025-02

2. A Two-Stage Multi-Modal Multi-Label Emotion Recognition Decision System Based on GCN;International Journal of Decision Support System Technology;2024-08-16

3. FDR-MSA: Enhancing multimodal sentiment analysis through feature disentanglement and reconstruction;Knowledge-Based Systems;2024-08

4. Multi-modal graph context extraction and consensus-aware learning for emotion recognition in conversation;Knowledge-Based Systems;2024-08

5. A Versatile Multimodal Learning Framework for Zero-Shot Emotion Recognition;IEEE Transactions on Circuits and Systems for Video Technology;2024-07