1. Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis Through Audio Analysis
2. Style tokens: Unsupervised style modeling, control and transfer in end-to-end speech synthesis; Wang
3. Towards end-to-end prosody transfer for expressive speech synthesis with Tacotron; Skerry-Ryan
4. DurIAN-E: Duration informed attention network for expressive text-to-speech synthesis; Gu, 2023
5. DurIAN: Duration Informed Attention Network for Speech Synthesis