Authors:
Iti Chaturvedi, Vlad Pandelea, Erik Cambria, Roy Welsch, Bithin Datta
Abstract
In this paper, we target the problem of generating facial expressions from a piece of audio. This is challenging because audio and video each have inherent characteristics distinct from the other. Some words produce identical lip movements, and speech impediments may prevent lip-reading in some individuals. Previous approaches to generating such a talking head suffered from stiff expressions because they focused only on lip movements, so the facial landmarks did not capture the information flow from the audio. Hence, in this work, we employ spatio-temporal independent component analysis to accurately sync the audio with the corresponding face video. Proper word formation also requires control over the facial muscles, which can be captured using a barrier function. We first validated the approach on a synthetic finite-element simulation of salt-water diffusion in coastal areas. Next, we applied it to 3D facial expressions in toddlers, for which training data is difficult to capture. Prior knowledge in the form of rules is specified using fuzzy logic, and multi-objective optimization is used to collectively learn a set of rules. We observed a significantly higher F-measure on three real-world problems.
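The abstract's central technical step is syncing audio with the corresponding face video via spatio-temporal independent component analysis. The sketch below is not the paper's implementation; it only illustrates one way such an alignment could be set up, assuming hypothetical per-frame MFCC audio features and flattened 2D facial landmarks as placeholders, and using scikit-learn's FastICA on the stacked modalities.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Hypothetical inputs (placeholders, not from the paper): per-frame audio
# features (e.g. MFCCs) and flattened 2D facial landmark coordinates,
# assumed to be resampled to the same frame count T.
T, n_audio, n_face = 200, 13, 68 * 2
audio_feats = np.random.randn(T, n_audio)   # stand-in MFCC stream
face_feats = np.random.randn(T, n_face)     # stand-in landmark trajectories

# Stack modalities along the feature axis so each independent component
# jointly mixes spatial (landmark) and temporal (audio) variation.
X = np.hstack([audio_feats, face_feats])    # shape (T, n_audio + n_face)

# Spatio-temporal ICA: unmix the joint stream into independent time
# courses; components loading on both modalities capture face motion
# that is synchronized with the audio.
ica = FastICA(n_components=10, random_state=0)
sources = ica.fit_transform(X)              # (T, 10) independent time courses
mixing = ica.mixing_                        # (n_audio + n_face, 10) loadings

# Rank components by how strongly they load on both modalities at once.
audio_load = np.abs(mixing[:n_audio]).mean(axis=0)
face_load = np.abs(mixing[n_audio:]).mean(axis=0)
sync_rank = np.argsort(-(audio_load * face_load))
print("components most shared across audio and face:", sync_rank[:3])
```

In practice the placeholder arrays would be replaced by real feature extractors, and the product-of-loadings ranking is only one heuristic for picking out audio-synchronized modes.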
Publisher
Springer Science and Business Media LLC