Abstract
Supported by artificial intelligence (AI), two technologies have recently taken on an essential role in many applications: chatbots and virtual humans. Both owe this to their ability to communicate with users and accomplish tasks suited to the purpose for which they were built. Virtual humans are attracting attention across industries due to their realistic human form, behavior, and ability to convey emotional feedback, especially when experienced in a virtual reality environment. Chatbots, in turn, are considered the most promising example of human-machine interaction because of their efficiency in communicating with people, which has led to their use in a wide range of applications. Combining a chatbot with a virtual human that behaves like an ordinary person is therefore likely to engage users and elicit positive feedback, since face-to-face communication has always played a central role in how people interact. We present an Open-Domain Conversational Digital Human System that provides a friendly virtual avatar and establishes realistic interaction with users. The system consists of a 3D virtual character and a set of artificial intelligence models, each dedicated to a specific task: emotion recognition, dialogue generation, facial expression extraction, animation, text-to-speech, and speech-to-text conversion.