Abstract
Recognizing facial expressions is a challenging task for both computers and humans. Although recent deep learning-based approaches achieve high accuracy in this task, research in this area has mainly focused on improving results using a single dataset for training and testing. This approach lacks generality when applied to new images or in in-the-wild contexts, due to human diversity (e.g., age, ethnicity) and differences in capture conditions (e.g., lighting or background). A cross-dataset approach can overcome these limitations. In this work we present a method to combine multiple datasets and conduct an exhaustive evaluation of a proposed CNN-based system, analyzing and comparing its performance under single-dataset and cross-dataset approaches and against other architectures. Accuracy of the proposed system ranged from 31.56% to 61.78% in the single-dataset approach across different well-known datasets, and improved up to 73.05% with the cross-dataset approach. Finally, to compare the system's performance with that of humans in facial expression classification, we compared the results of 253 participants with those of the system. Results show an 83.53% accuracy for humans, and a correlation exists between the results obtained by the participants and those of the CNN.
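To make the cross-dataset idea concrete, the following is a minimal sketch (not the authors' code) of how several facial expression datasets can be merged into one training set with a shared label space and used to train a small CNN. It assumes PyTorch/torchvision; the dataset paths, label names, and network architecture are hypothetical placeholders, not those of the paper.

```python
# Sketch: cross-dataset training of a CNN for facial expression recognition.
# Assumes each dataset folder follows the ImageFolder layout (one subfolder per
# expression) and uses the same label names, so class indices align across datasets.
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

COMMON_LABELS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]

transform = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((48, 48)),
    transforms.ToTensor(),
])

# Hypothetical dataset locations; concatenating them yields the cross-dataset training set.
sources = ["data/fer_a", "data/fer_b", "data/fer_c"]
combined = ConcatDataset([datasets.ImageFolder(p, transform=transform) for p in sources])
loader = DataLoader(combined, batch_size=64, shuffle=True)

class SmallCNN(nn.Module):
    """Compact CNN for 48x48 grayscale faces (illustrative only)."""
    def __init__(self, num_classes=len(COMMON_LABELS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48 -> 24
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24 -> 12
        )
        self.classifier = nn.Linear(64 * 12 * 12, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = SmallCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One pass over the combined training set; evaluation on a held-out dataset
# (not seen during training) would measure cross-dataset generalization.
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```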
Funder
Ministerio de Ciencia e Innovación
Ministerio de Ciencia, Innovación y Universidades
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Hardware and Architecture, Media Technology, Software
Cited by
19 articles.