1. WANG Y, SKERRY-RYAN R J, STANTON D, et al. Tacotron: Towards End-to-End Speech Synthesis [J]. 2017.
2. Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
3. REN Y, RUAN Y, TAN X, et al. FastSpeech: Fast, robust and controllable text to speech [J]. Advances in neural information processing systems, 2019, 32.
4. REN Y, HU C, TAN X, et al. FastSpeech 2: Fast and high-quality end-to-end text to speech [J]. arXiv preprint arXiv:2006.04558, 2020.
5. TAN X, QIN T, SOONG F, et al. A survey on neural speech synthesis [J]. arXiv preprint arXiv:2106.15561, 2021.