1. Z. Wu, T. Kinnunen, N. Evans, J. Yamagishi, C. Hanilçi, M. Sahidullah, and A. Sizov, ASVspoof 2015: The first automatic speaker verification spoofing and countermeasures challenge, in Sixteenth Annual Conference of the International Speech Communication Association (2015).
2. A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, WaveNet: A generative model for raw audio, arXiv preprint arXiv:1609.03499 (2016).
3. S. Ö. Arık, M. Chrzanowski, A. Coates, G. Diamos, A. Gibiansky, Y. Kang, X. Li, J. Miller, A. Ng, J. Raiman, S. Sengupta, and M. Shoeybi, Deep Voice: Real-time neural text-to-speech, in Proceedings of the 34th International Conference on Machine Learning 70, 195–204 (2017).
4. Y. Wang, R. J. Skerry-Ryan, D. Stanton, Y. Wu, R. J. Weiss, N. Jaitly, Z. Yang, Y. Xiao, Z. Chen, S. Bengio, Q. Le, Y. Agiomyrgiannakis, R. Clark, and R. A. Saurous, Tacotron: Towards end-to-end speech synthesis, in Proceedings of Interspeech (2017).
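5. J. Lorenzo-Trueba, J. Yamagishi, T. Toda, D. Saito, F. Villavicencio, T. Kinnunen, and Z. Ling, The Voice Conversion Challenge 2018: Promoting development of parallel and nonparallel methods, in Proceedings of Odyssey 2018: The Speaker and Language Recognition Workshop (2018).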