Comparison of real-time multi-speaker neural vocoders on CPUs

Published:2022-03-01 Issue:2 Volume:43 Page:121-124
ISSN:1346-3969
Container-title:Acoustical Science and Technology
language:en
Short-container-title:Acoust. Sci. & Tech.

Author:

Matsubara Keisuke¹, Okamoto Takuma², Takashima Ryoichi¹, Takiguchi Tetsuya¹, Toda Tomoki², Kawai Hisashi²

Affiliation:

1. Graduate School of System Informatics, Kobe University

2. National Institute of Information and Communications Technology

Publisher

Acoustical Society of Japan

Subject

Acoustics and Ultrasonics

Link

https://www.jstage.jst.go.jp/article/ast/43/2/43_E2161/_pdf

References: 16 articles.

1. A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior and K. Kavukcuoglu, "WaveNet: A generative model for raw audio," Proc. SSW9, p. 125 (2016).

2. J. Shen, R. Pang, R. J. Weiss, M. Schuster, N. Jaitly, Z. Yang, Z. Chen, Y. Zhang, Y. Wang, R. J. Skerry-Ryan, R. A. Saurous, Y. Agiomyrgiannakis and Y. Wu, "Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions," Proc. ICASSP 2018, pp. 4779–4783 (2018).

3. A. Tamamori, T. Hayashi, K. Kobayashi, K. Takeda and T. Toda, "Speaker-dependent WaveNet vocoder," Proc. Interspeech 2017, pp. 1118–1122 (2017).

4. J. L.-Trueba, T. Drugman, J. Latorre, T. Merritt, B. Putrycz, R. B.-Chicote, A. Moinet and V. Aggarwal, "Towards achieving robust universal neural vocoding," Proc. Interspeech 2019, pp. 181–185 (2019).

5. J. Kong, J. Kim and J. Bae, "HiFi-GAN: Generative adversarial networks for efficient and high fidelity speech synthesis," Proc. NeurIPS 2020, pp. 17022–17033 (2020).

Cited by 3 articles.

1. A review of differentiable digital signal processing for music and speech synthesis;Frontiers in Signal Processing;2024-01-11

2. PUFFIN: Pitch-Synchronous Neural Waveform Generation for Fullband Speech on Modest Devices;ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2023-06-04

3. Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2023