Affiliation:
1. School of Software, Henan University of Engineering, Zhengzhou, China
Abstract
Video emotion recognition has attracted increasing attention. Most existing approaches rely on spatial features extracted from video frames and often ignore the context information in videos and its relationships, which limits their performance. In this study, we propose a video emotion recognition method based on a sparse spatial-temporal emotion graph convolutional network (SE-GCN). For the spatial graph, the emotional relationship between every pair of emotion proposal regions is first calculated, and the sparse spatial graph is constructed according to these relationships. For the temporal graph, the emotional information contained in each emotion proposal region is first analyzed, and the sparse temporal graph is constructed from the emotion proposal regions with rich emotional cues. The reasoning features of the emotional relationships are then obtained by the spatial-temporal GCN. Finally, the features of the emotion proposal regions and the spatial-temporal relationship features are fused to recognize the video emotion. Extensive experiments are conducted on four challenging benchmark datasets: MHED, HEIV, VideoEmotion-8, and Ekman-6. The experimental results demonstrate that the proposed method achieves state-of-the-art performance.
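The abstract does not give the formulas behind the sparse graph construction or the GCN reasoning step, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes cosine similarity as a stand-in for the "emotional relationship" score, top-k neighbor selection as the sparsification rule, a single standard GCN layer with symmetric normalization, and mean pooling plus concatenation as the fusion step. The function names `sparse_spatial_graph` and `gcn_layer`, the feature sizes, and the random inputs are all hypothetical.

```python
import numpy as np

def sparse_spatial_graph(features, k=2):
    """Keep each region's k most similar neighbors (assumed sparsification rule).

    Cosine similarity is an assumed stand-in for the emotional relationship score.
    Requires k < number of regions.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)          # never pick a region as its own neighbor
    n = features.shape[0]
    top = np.argsort(-sim, axis=1)[:, :k]   # indices of the k largest similarities per row
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), top.ravel()] = 1.0
    return np.maximum(adj, adj.T)           # symmetrize the graph

def gcn_layer(adj, features, weight):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)

# Hypothetical data: 5 emotion proposal regions with 8-dimensional features.
rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 8))
weight = rng.normal(size=(8, 4))            # untrained weights, for shape illustration only

adj = sparse_spatial_graph(regions, k=2)
relation = gcn_layer(adj, regions, weight)

# Assumed fusion: concatenate pooled region features with pooled relation features.
fused = np.concatenate([regions.mean(axis=0), relation.mean(axis=0)])
```

In this sketch the fused vector would feed a classifier; a temporal graph could be built the same way over regions selected across frames, though the selection criterion for "rich emotional cues" is not specified in the abstract.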
Funder
Foundation of Henan Educational Committee
Subject
General Mathematics, General Medicine, General Neuroscience, General Computer Science
Cited by
1 article.