Abstract
The performance of learning-based group emotion recognition (GER) methods depends on the number of labeled samples. Although many group emotion images are available on the Internet, labeling them manually is labor-intensive and costly. As a result, GER datasets are usually small, which limits recognition performance. Because manual labeling is difficult, training the network with a limited number of labeled images together with a large number of unlabeled images is a promising way to improve GER performance. In this work, we propose a semi-supervised group emotion recognition framework based on contrastive learning that learns efficient features from both labeled and unlabeled images. In the proposed method, the unlabeled images are first used to pretrain the backbone with a contrastive learning method, and the labeled images are then used to fine-tune the network. The fine-tuned network assigns pseudo-labels to the unlabeled images, which are then used for further training. To alleviate the uncertainty of these pseudo-labels, we propose a Weight Cross-Entropy Loss (WCE-Loss) that suppresses the influence of samples with unreliable pseudo-labels during training. Experimental results on three prominent GER benchmark datasets show the effectiveness of the proposed framework and its superiority over other competitive state-of-the-art methods.
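The pseudo-labeling step and the weighted loss described above can be illustrated with a minimal sketch. The abstract does not specify how the WCE-Loss weights are computed, so this sketch assumes that each sample's weight is the confidence (maximum softmax probability) of the fine-tuned network and that the loss is normalized by the sum of the weights. The function names `pseudo_label` and `weighted_cross_entropy` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(teacher_logits):
    """Return (label, confidence) predicted by the fine-tuned network.

    Assumption: confidence is the maximum softmax probability.
    """
    probs = softmax(teacher_logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


def weighted_cross_entropy(student_logits, labels, weights):
    """Weighted average of per-sample cross-entropy.

    Samples with small weights (unreliable pseudo-labels) contribute less.
    Assumption: the loss is normalized by the sum of the weights.
    """
    total = 0.0
    for logits, y, w in zip(student_logits, labels, weights):
        probs = softmax(logits)
        total += w * -math.log(probs[y])
    return total / sum(weights)


# Hypothetical logits for two unlabeled images and three emotion classes.
teacher = [[4.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
student = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]

pairs = [pseudo_label(t) for t in teacher]
labels = [label for label, _ in pairs]
weights = [conf for _, conf in pairs]

loss = weighted_cross_entropy(student, labels, weights)
print(labels, weights, loss)
```

In this example the second image receives a low-confidence pseudo-label that the student disagrees with, so its large per-sample loss is down-weighted relative to an unweighted average.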
Funder
National Natural Science Foundation of China
Science and Technology Program of Guangzhou, China
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
Cited by
5 articles.