1. Hunt, A.J., Black, A.W.: Unit selection in a concatenative speech synthesis system using a large speech database. In: 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, pp. 373–376 (1996)
2. Hamon, C., Moulines, E., Charpentier, F.: A diphone synthesis system based on time-domain prosodic modifications of speech. In: International Conference on Acoustics, Speech, and Signal Processing, pp. 238–241 (1989)
3. Tokuda, K., Nankaku, Y., Toda, T., Zen, H., Yamagishi, J., Oura, K.: Speech synthesis based on hidden Markov models. Proceedings of the IEEE 101(5), 1234–1252 (2013)
4. Yu, K., Young, S.: Continuous F0 modeling for HMM based statistical parametric speech synthesis. IEEE Transactions on Audio, Speech, and Language Processing 19(5), 1071–1079 (2011)
5. van den Oord, A., et al.: WaveNet: a generative model for raw audio. CoRR arXiv:1609.03499 (2016)