Affiliation:
1. Modern Education Society, College of Engineering, Pune, India
Abstract
Speech-driven facial animation can be regarded as speech-to-face translation. Speech-driven facial motion synthesis involves speech analysis and face modeling: the method takes a still image of a person together with a speech signal and produces an animation of that person talking. Our method uses a GAN discriminator to achieve better lip synchronization with the audio; adversarial training also yields realistic facial expressions, making the talking character more convincing. The system takes into account factors such as lip-sync accuracy, sharpness, and the ability to generate high-quality faces and natural eye blinks. GANs are widely used for image generation because the adversarial loss produces sharper, more detailed images, and they extend naturally from still images to video.
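The adversarial loss mentioned above can be illustrated with the standard non-saturating GAN objective. The following minimal NumPy sketch (the function names and toy numbers are hypothetical, not from the paper) computes the discriminator and generator losses from the discriminator's probability outputs on real and generated frames:

```python
import numpy as np

def bce(pred, target):
    # Binary cross-entropy, clipped for numerical stability.
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def adversarial_losses(d_real, d_fake):
    """d_real / d_fake: discriminator probabilities in (0, 1) for real
    and generated frames. Returns (discriminator_loss, generator_loss)."""
    # Discriminator: label real frames 1 and generated frames 0.
    d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))
    # Non-saturating generator loss: push generated frames toward "real".
    g_loss = bce(d_fake, np.ones_like(d_fake))
    return d_loss, g_loss

# Toy example: a discriminator that is confident and mostly correct,
# so the generator loss is large and drives sharper outputs.
d_loss, g_loss = adversarial_losses(np.array([0.9, 0.8]), np.array([0.1, 0.2]))
```

In a full pipeline these losses would be backpropagated through the discriminator and generator in alternating steps; the sketch only shows the loss computation itself.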