1. Kyubyong Park. 2018. KSS Dataset: Korean Single Speaker Speech Dataset. https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset
2. Henny Admoni. 2017. Social eye gaze in human-robot interaction: A review. Journal of Human-Robot Interaction (2017).
3. Chaitanya Ahuja, Dong Won Lee, Yukiko I. Nakano, and Louis-Philippe Morency. 2020. Style transfer for co-speech gesture animation: A multi-speaker conditional-mixture approach. In Proceedings of the European Conference on Computer Vision (ECCV’20), Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). Springer International Publishing, Cham, 248–265.
4. Niki Aifanti, Christos Papachristou, and Anastasios Delopoulos. 2010. The MUG facial expression database. In Proceedings of the 11th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS’10). IEEE, Desenzano del Garda, Italy, 1–4.
5. Simon Alexanderson, Gustav Eje Henter, Taras Kucherenko, and Jonas Beskow. 2020. Style-controllable speech-driven gesture synthesis using normalising flows. Computer Graphics Forum 39, 2 (2020), 487–496.