Authors:
Niinuma Koichiro, Onal Ertugrul Itir, Cohn Jeffrey F., Jeni László A.
Abstract
The performance of automated facial expression coding is improving steadily. Advances in deep learning techniques have been key to this success. While the advantage of modern deep learning techniques is clear, the contribution of critical design choices remains largely unknown, especially for facial action unit occurrence and intensity across pose. Using the Facial Expression Recognition and Analysis 2017 (FERA 2017) database, which provides a common protocol to evaluate robustness to pose variation, we systematically evaluated design choices in pre-training, feature alignment, model size selection, and optimizer details. Informed by the findings, we developed an architecture that exceeds the state of the art on FERA 2017. The architecture achieved a 3.5% increase in F1 score for occurrence detection and a 5.8% increase in Intraclass Correlation (ICC) for intensity estimation. To evaluate the generalizability of the architecture to unseen poses and new dataset domains, we performed experiments across pose in FERA 2017 and across domains in the Denver Intensity of Spontaneous Facial Action (DISFA) database and the UNBC Pain Archive.
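The abstract reports results in the two metrics conventional for this task: F1 score for action unit occurrence detection and Intraclass Correlation for intensity estimation. As a hedged illustration of how such scores are typically computed (function names and the ICC(3,1) variant are assumptions, not taken from the paper), the following sketch evaluates per-AU predictions against ground truth:

```python
import numpy as np

def f1_score_binary(y_true, y_pred):
    """F1 for binary AU occurrence labels (1 = AU present)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def icc_3_1(y_true, y_pred):
    """ICC(3,1): two-way mixed model, single measures, consistency.

    Treats ground truth and prediction as k = 2 'raters' over n frames
    and computes the ANOVA mean squares directly.
    """
    Y = np.stack([np.asarray(y_true, float), np.asarray(y_pred, float)], axis=1)
    n, k = Y.shape
    grand = Y.mean()
    row_means = Y.mean(axis=1)          # per-frame means
    col_means = Y.mean(axis=0)          # per-rater means
    ss_total = ((Y - grand) ** 2).sum()
    ss_rows = k * ((row_means - grand) ** 2).sum()   # between-targets
    ss_cols = n * ((col_means - grand) ** 2).sum()   # between-raters
    ss_err = ss_total - ss_rows - ss_cols            # residual
    bms = ss_rows / (n - 1)
    ems = ss_err / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems)
```

For example, a prediction that agrees with ground truth on half of a balanced set yields an F1 of 0.5, and intensity predictions identical to the labels yield an ICC of 1.0.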
Funder
Fujitsu Laboratories of America
Cited by: 3 articles.