Affiliation:
1. University of Miskolc, Egyetemvaros, 3515 Miskolc, Hungary
Abstract
Human-like agents are becoming increasingly common. However, their usefulness depends to a large extent on how natural their movements appear. The classification procedure presented in this article aims to make the head movements of human-like agents more natural. The method estimates the vertical range of head movement from the speech sound alone, which allows a final-phase amplitude correction of the head movements generated for virtual talking heads. Its advantages are that it requires no visual information, works for general subjects, can be made more precise and effective by defining further classes, and can improve the naturalness of any head movement generation method’s output through posterior amplitude scaling.
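As an illustration only, the posterior amplitude scaling step could be sketched as follows. The sketch assumes that a classifier has already assigned the utterance to a vertical-range class, and that each class maps to a target range in degrees. The class names, the range values, and the function `rescale_pitch` are hypothetical and are not taken from the article.

```python
import numpy as np

# Hypothetical mapping from a predicted range class to a target
# vertical (pitch) range in degrees; the values are illustrative only.
CLASS_TO_RANGE_DEG = {"small": 4.0, "medium": 8.0, "large": 12.0}


def rescale_pitch(pitch_deg, predicted_class, class_to_range=CLASS_TO_RANGE_DEG):
    """Scale a generated pitch trajectory so that its peak-to-peak range
    matches the target range of the predicted class.

    The trajectory is scaled about its mean, so the average head pose
    is left unchanged and only the amplitude is corrected.
    """
    pitch = np.asarray(pitch_deg, dtype=float)
    current_range = pitch.max() - pitch.min()
    if current_range == 0.0:
        # A flat trajectory has no amplitude to scale; return it unchanged.
        return pitch.copy()
    target_range = class_to_range[predicted_class]
    center = pitch.mean()
    return center + (pitch - center) * (target_range / current_range)


# Usage: correct the output of any head movement generator.
generated = [0.0, 1.0, 2.0, 1.0, 0.0]
corrected = rescale_pitch(generated, "medium")
```

Because the correction is applied only to the finished trajectory, it is independent of how that trajectory was generated, which matches the claim that the scaling can follow any head movement generation method.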