Abstract
Facial Expression Recognition (FER) enables an understanding of the emotional changes of a specific target group. The relatively small size of FER datasets and the limited accuracy of expression recognition remain challenges for researchers. In recent years, with the rapid development of computer technology and, in particular, the great progress of deep learning, more and more convolutional neural networks have been developed for FER research. However, most convolutional neural networks do not perform well enough when dealing with overfitting caused by too-small datasets and with noise arising from expression-independent intra-class differences. In this paper, we propose a Dual Path Stacked Attention Network (DPSAN) to better cope with these challenges. First, the features of key facial regions are extracted by segmentation, and irrelevant regions are ignored, which effectively suppresses intra-class differences. Second, by providing both the global image and the segmented local image regions as training data for the integrated dual path model, the overfitting problem caused by a lack of data can be effectively mitigated. Finally, we design a stacked attention module that weights the fused feature maps according to the importance of each part for expression recognition. For the cropping scheme, we adopt a cropping method based on four fixed regions of the face image, segmenting out the key regions and ignoring irrelevant ones to improve computational efficiency. Experimental results on the public CK+ and FERPLUS datasets demonstrate the effectiveness of DPSAN: its accuracy reaches the level of current state-of-the-art methods, with 93.2% on the CK+ dataset and 87.63% on the FERPLUS dataset.
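The two ideas sketched in the abstract, fixed four-region cropping and importance-weighted fusion of features, can be illustrated in a minimal, self-contained way. The fractional box coordinates and the dot-product scoring below are illustrative assumptions for the sketch, not the paper's actual coordinates or attention design:

```python
import math

def crop_four_regions(image, h, w):
    """Crop four fixed face regions from an h x w grayscale image
    (nested lists). The fractional boxes (top, bottom, left, right)
    are hypothetical placeholders, not the paper's exact regions."""
    boxes = {
        "left_eye":  (0.20, 0.50, 0.05, 0.50),
        "right_eye": (0.20, 0.50, 0.50, 0.95),
        "nose":      (0.35, 0.70, 0.25, 0.75),
        "mouth":     (0.60, 0.95, 0.20, 0.80),
    }
    regions = {}
    for name, (t, b, l, r) in boxes.items():
        cols = slice(int(l * w), int(r * w))
        regions[name] = [image[i][cols] for i in range(int(t * h), int(b * h))]
    return regions

def attention_fuse(features, score_weights):
    """Hedged sketch of importance-weighted fusion: score each
    feature vector (here a simple dot product with score_weights,
    standing in for a learned scoring network), normalize the
    scores with a softmax, and return the weighted sum."""
    scores = [sum(w * x for w, x in zip(score_weights, f)) for f in features]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    fused = [sum(a * f[i] for a, f in zip(attn, features))
             for i in range(len(features[0]))]
    return fused, attn
```

In a dual path setup along these lines, the global image and the four cropped regions would each be encoded into a feature vector, and `attention_fuse` would combine them so that more informative parts contribute more to the final expression prediction.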
Subject
Computer Networks and Communications
Cited by
3 articles.