A Generative Adversarial Network Based Ensemble Technique for Automatic Evaluation of Machine Synthesized Speech

Published:2020 Page:580-593
ISSN:0302-9743
Container-title:Lecture Notes in Computer Science

Author:

Jaiswal Jaynil, Chaubey Ashutosh, Bhimavarapu Sasi Kiran Reddy, Kashyap Shashank, Kumar Puneet, Raman Balasubramanian, Roy Partha Pratim

Publisher

Springer International Publishing

Link

http://link.springer.com/content/pdf/10.1007/978-3-030-41299-9_45

References: 26 articles.

1. Arik, S.Ö., et al.: Deep voice: real-time neural text-to-speech. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 195–204. JMLR.org (2017)

2. Wang, Y., et al.: Tacotron: towards end-to-end speech synthesis. arXiv preprint arXiv:1703.10135 (2017)

3. Ping, W., et al.: Deep voice 3: scaling text-to-speech with convolutional sequence learning. arXiv preprint arXiv:1710.07654 (2017)

4. van den Oord, A., et al.: Wavenet: a generative model for raw audio. arXiv preprint arXiv:1609.03499 (2016)

5. Salza, P.L., Foti, E., Nebbia, L., Oreglia, M.: MOS and pair comparison combined methods for quality evaluation of text-to-speech systems. Acta Acust. United Acust. 82(4), 650–656 (1996)

Cited by 4 articles.

1. Unsupervised Emotion Matching for Image and Text Input;2024 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation (IATMSI);2024-03-14

2. TabNet to Identify Risks in Chronic Kidney Disease Using GAN's Synthetic Data;2022 2nd International Conference on Technological Advancements in Computational Sciences (ICTACS);2022-10-10

3. Multilevel Ensemble Method to Identify Risks in Chronic Kidney Disease Using Hybrid Synthetic Data;2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT);2022-10-03

4. The Human Takes It All: Humanlike Synthesized Voices Are Perceived as Less Eerie and More Likable. Evidence From a Subjective Ratings Study;Frontiers in Neurorobotics;2020-12-16