Affiliation:
1. College of Information and Communication Engineering, Communication University of China, Beijing, China
2. School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
Abstract
Continuous emotion recognition aims to predict emotional states from affective information, with a focus on the continuous variation of emotion. Fusion of electroencephalography (EEG) and facial expression videos has been applied in this field, but current research has limitations such as reliance on hand-engineered features and simplistic integration approaches. Hence, a new continuous emotion recognition model based on the fusion of EEG and facial expression videos, named the residual multimodal Transformer (RMMT), is proposed. Firstly, ResNet50 and a temporal convolutional network (TCN) are utilised to extract spatiotemporal features from videos, and a TCN is also applied to the computed EEG frequency power to acquire spatiotemporal features of the EEG. Then, a multimodal Transformer is used to fuse the spatiotemporal features from the two modalities. Furthermore, a residual connection is introduced to fuse shallow features with deep features, which is experimentally verified to be effective for continuous emotion recognition. Inspired by knowledge distillation, the authors incorporate a feature-level loss into the loss function to further enhance network performance. Experimental results show that the RMMT outperforms other methods on the MAHNOB-HCI dataset. Ablation studies on the residual connection and the loss function in the RMMT demonstrate that both are effective.
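To make the described pipeline concrete, the following is a minimal PyTorch sketch of an RMMT-style model: per-frame ResNet50 features and EEG band power are each passed through a TCN, fused by a Transformer encoder, and combined with a residual connection between the shallow (pre-fusion) and deep (post-fusion) features. All module names, dimensions, and the exact fusion scheme are assumptions for illustration; the paper's architecture and hyperparameters may differ.

```python
# Hypothetical sketch of the RMMT pipeline described in the abstract.
# Dimensions and fusion details are illustrative assumptions, not the
# authors' exact implementation.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class TCNBlock(nn.Module):
    """One dilated causal 1-D convolution block, a standard TCN unit."""
    def __init__(self, channels: int, dilation: int, kernel_size: int = 3):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=self.pad, dilation=dilation)
        self.act = nn.ReLU()

    def forward(self, x):                    # x: (batch, channels, time)
        y = self.conv(x)[..., :-self.pad]    # trim right pad to stay causal
        return self.act(y) + x               # residual inside the TCN


class RMMTSketch(nn.Module):
    def __init__(self, eeg_dim: int = 160, d_model: int = 256):
        super().__init__()
        # Frame-level spatial features from ResNet50 (classifier removed).
        backbone = resnet50(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # -> 2048-d
        self.video_proj = nn.Linear(2048, d_model)
        self.eeg_proj = nn.Linear(eeg_dim, d_model)
        # Per-modality temporal modelling with stacked dilated TCN blocks.
        self.video_tcn = nn.Sequential(*[TCNBlock(d_model, 2 ** i) for i in range(3)])
        self.eeg_tcn = nn.Sequential(*[TCNBlock(d_model, 2 ** i) for i in range(3)])
        # Multimodal Transformer over the concatenated token sequence.
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)    # continuous valence (or arousal)

    def forward(self, frames, eeg_power):
        # frames: (B, T, 3, 224, 224); eeg_power: (B, T, eeg_dim)
        B, T = frames.shape[:2]
        spatial = self.cnn(frames.flatten(0, 1)).flatten(1)       # (B*T, 2048)
        v = self.video_proj(spatial).view(B, T, -1)               # (B, T, D)
        e = self.eeg_proj(eeg_power)                              # (B, T, D)
        v = self.video_tcn(v.transpose(1, 2)).transpose(1, 2)
        e = self.eeg_tcn(e.transpose(1, 2)).transpose(1, 2)
        shallow = torch.cat([v, e], dim=1)                        # (B, 2T, D)
        deep = self.fusion(shallow)
        fused = deep + shallow   # residual link between shallow and deep features
        # Average the two modality streams back to T steps before regression.
        fused = (fused[:, :T] + fused[:, T:]) / 2
        return self.head(fused).squeeze(-1)                       # (B, T)
```

The knowledge-distillation-inspired feature-level loss mentioned in the abstract is omitted above; one plausible form would add a mean-squared-error term between intermediate features and a reference representation alongside the regression loss, but the paper should be consulted for the actual formulation.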
Publisher
Institution of Engineering and Technology (IET)