1. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
2. Grad-TTS: A diffusion probabilistic model for text-to-speech;Popov
3. NaturalSpeech: End-to-End Text-to-Speech Synthesis with Human-Level Quality
4. Glow-TTS: A generative flow for text-to-speech via monotonic alignment search;Kim;Advances in Neural Information Processing Systems,2020
5. Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech;Kim