Publisher
Springer Nature Singapore