DNN-Based Full-Band Speech Synthesis Using GMM Approximation of Spectral Envelope

Published: 2020-12-01 Issue: 12 Volume: E103.D Pages: 2673-2681
ISSN: 0916-8532
Container title: IEICE Transactions on Information and Systems
Language: en
Short container title: IEICE Trans. Inf. & Syst.

Author:

KOGUCHI Junya¹, TAKAMICHI Shinnosuke², MORISE Masanori¹, SARUWATARI Hiroshi², SAGAYAMA Shigeki³

Affiliation:

1. Meiji University

2. The University of Tokyo

3. The University of Electro-Communications

Publisher

Institute of Electronics, Information and Communications Engineers (IEICE)

Subject

Artificial Intelligence, Electrical and Electronic Engineering, Computer Vision and Pattern Recognition, Hardware and Architecture, Software

Link

https://www.jstage.jst.go.jp/article/transinf/E103.D/12/E103.D_2020EDP7075/_pdf

References: 25 articles.

[1] Y. Sagisaka, K. Takeda, M. Abe, S. Katagiri, T. Umeda, and H. Kuwabara, “A large-scale Japanese speech database,” Proc. ICSLP, Kobe, Japan, pp.1089-1092, Nov. 1990.

[2] Y. Wang, R.J. Skerry-Ryan, D. Stanton, Y. Wu, R.J. Weiss, N. Jaitly, Z. Yang, Y. Xiao, Z. Chen, S. Bengio, Q. Le, Y. Agiomyrgiannakis, R. Clark, and R.A. Saurous, “Tacotron: Towards end-to-end speech synthesis,” Proc. INTERSPEECH, Stockholm, Sweden, pp.4006-4010, Aug. 2017. doi:10.21437/interspeech.2017-1452

[3] H. Zen, K. Tokuda, and A. Black, “Statistical parametric speech synthesis,” Speech Communication, vol.51, no.11, pp.1039-1064, 2009. doi:10.1016/j.specom.2009.04.004

[4] K. Tokuda, T. Kobayashi, T. Masuko, and S. Imai, “Mel-generalized cepstral analysis - a unified approach to speech spectral estimation,” Proc. ICSLP, Yokohama, Japan, pp.410-415, Sept. 1994.

[5] P. Zolfaghari and T. Robinson, “Formant analysis using mixtures of Gaussians,” Proc. ICSLP, vol.2, pp.1229-1232, 1996. doi:10.1109/icslp.1996.607830

Cited by 2 articles.

1. Study about Chinese Speech Synthesis Algorithm and Acoustic Model Based on Wireless Communication Network; Wireless Communications and Mobile Computing; 2021-10-04

2. Full-Band LPCNet: A Real-Time Neural Vocoder for 48 kHz Audio With a CPU; IEEE Access; 2021