1. Deep Voice: Real-time neural text-to-speech;Arik,2017
2. FastSpeech 2: Fast and high-quality end-to-end text to speech;Ren,2021
3. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions;Shen,2018
4. Glow-TTS: A generative flow for text-to-speech via monotonic alignment search;Kim,2020
5. Grad-TTS: A diffusion probabilistic model for text-to-speech;Popov,2021