Authors:
Stan Adriana, Lőrincz Beáta
Abstract
This chapter presents an overview of current approaches to generating spoken content with text-to-speech (TTS) synthesis systems, and thus to creating the voice of an Interactive Virtual Assistant (IVA). The overview starts from the issues that make spoken content generation a non-trivial task and introduces the two main components of a TTS system: text processing and acoustic modelling. It then gives the reader the minimum scientific detail on the terminology and methods of speech synthesis needed to make initial decisions about the technology behind the vocal identity of the IVA. The description of speech synthesis methodologies begins with basic, easy-to-run, low-requirement rule-based synthesis and ends with the state-of-the-art deep learning landscape. To bring this complex and extensive research field closer to commercial deployment, the chapter provides an extensive index of the readily and freely available resources and tools required to build a TTS system. Quality evaluation methods and open research problems are also highlighted at the end of the chapter.
Cited by
2 articles.
1. The Role of Vocal Persona in Natural and Synthesized Speech;2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG);2023-01-05
2. On the Potential of Modular Voice Conversion for Virtual Agents;2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW);2021-09-28