1) "Tohoku Kiritan singing database for researchers (in Japanese)," https://zunko.jp/kiridev/login.php (accessed 11 Oct. 2020).
2) H. Zen, K. Tokuda and A. W. Black, "Statistical parametric speech synthesis," Speech Commun., 51, 1039–1064 (2009).
3) H. Dudley, "Remaking speech," J. Acoust. Soc. Am., 11, 169–177 (1939).
4) H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A. W. Black and K. Tokuda, "The HMM-based speech synthesis system (HTS) version 2.0," Proc. ISCA SSW6, pp. 294–299 (2007).
5) H. Zen, A. Senior and M. Schuster, "Statistical parametric speech synthesis using deep neural networks," Proc. ICASSP, pp. 7962–7966 (2013).