Fast Griffin Lim based waveform generation strategy for text-to-speech synthesis

Published: 2020-08-15 Issue: 41-42 Volume: 79 Page: 30205-30233
ISSN: 1380-7501
Container-title: Multimedia Tools and Applications
Language: en
Short-container-title: Multimed Tools Appl

Author:

Sharma Ankit, Kumar Puneet, Maddukuri Vikas, Madamshetti Nagasai, Kishore K. G., Kavuru Sahit Sai Sriram, Raman Balasubramanian, Roy Partha Pratim

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Hardware and Architecture, Media Technology, Software

Link

https://link.springer.com/content/pdf/10.1007/s11042-020-09321-7.pdf

References: 47 articles.

1. Aaron A, Bakis R, Eide EM, Hamza WM (2014) Systems and methods for text-to-speech synthesis using spoken example, November 11 2014. US Patent 8,886,538

2. Arik SO, Chrzanowski M, Coates A, Diamos G, Gibiansky A, Kang Y, Li X, Miller J, Ng A, Raiman J, et al. (2017) Deep voice: real-time neural text-to-speech. In: Proceedings of the 34th international conference on machine learning (ICML), vol 70, pp 195–204

3. Arik SO, Jun H, Diamos G (2018) Fast spectrogram inversion using multi-head convolutional neural networks. IEEE Signal Process Lett 26(1):94–98

4. Bracewell RN (1986) The Fourier transform and its applications. McGraw-Hill, New York

5. Braunschweiler N, Gales MJF, Buchholz S (2010) Lightly supervised recognition for automatic alignment of large coherent speech recordings. In: INTERSPEECH

Cited by 5 articles.

1. Enhancing spoken dialect identification with stacked generalization of deep learning models;Multimedia Tools and Applications;2024-09-04

2. AI Enabled Avionics Domain Specific TTS System;2023 6th International Conference on Recent Trends in Advance Computing (ICRTAC);2023-12-14

3. Black-Box Watermarking and Blockchain for IP Protection of Voiceprint Recognition Model;Electronics;2023-09-01

4. Deep Iterative Phase Retrieval for Ptychography;ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2022-05-23

5. Text-to-speech with linear spectrogram prediction for quality and speed improvement;Phonetics and Speech Sciences;2021-09