Author:
Wang Chunhui, Zeng Chang, Chen Jun, Xue Ouyang
Publisher
Springer Nature Singapore
References: 28 articles.
1. Lu, C., Zhang, P., Yan, Y.: Self-attention based prosodic boundary prediction for Chinese speech synthesis. In: ICASSP, pp. 7035–7039. IEEE (2019)
2. Yang, B., Zhong, J., Liu, S.: Pre-trained text representations for improving front-end text processing in Mandarin text-to-speech synthesis. In: INTERSPEECH, pp. 4480–4484 (2019)
3. Shen, J., et al.: Natural TTS synthesis by conditioning WaveNet on Mel spectrogram predictions. In: ICASSP, pp. 4779–4783. IEEE (2018)
4. Ren, Y., et al.: FastSpeech: fast, robust and controllable text to speech. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
5. Yamamoto, R., Song, E., Kim, J.-M.: Parallel WaveGAN: a fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram. In: ICASSP, pp. 6199–6203. IEEE (2020)
Cited by
1 article.
1. SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14