1. Adiga, 2019. Acoustic Features Modelling for Statistical Parametric Speech Synthesis: A Review. IETE Tech. Rev.
2. Arik, S., Diamos, G., Gibiansky, A., Miller, J., Peng, K., Ping, W., Raiman, J., Zhou, Y., 2017a. Deep Voice 2: Multi-Speaker Neural Text-to-Speech. arXiv:1705.08947.
3. Arik, S.O., Chrzanowski, M., Coates, A., Diamos, G., Gibiansky, A., Kang, Y., Li, X., Miller, J., Ng, A., Raiman, J., Sengupta, S., Shoeybi, M., 2017b. Deep Voice: Real-time Neural Text-to-Speech. arXiv:1702.07825.
4. Arik, 2019. Fast Spectrogram Inversion Using Multi-Head Convolutional Neural Networks. IEEE Signal Process. Lett.
5. Bińkowski, M., Donahue, J., Dieleman, S., Clark, A., Elsen, E., Casagrande, N., Cobo, L.C., Simonyan, K., 2019. High Fidelity Speech Synthesis with Adversarial Networks. arXiv:1909.11646.