Authors: Du Hui-Peng, Lu Ye-Xin, Ai Yang, Ling Zhen-Hua
Publisher: Springer Nature Singapore