Speaker Adaptive Text-to-Speech With Timbre-Normalized Vector-Quantized Feature

Published:2023 Volume:31 Page:3446-3456
ISSN:2329-9290
Container-title:IEEE/ACM Transactions on Audio, Speech, and Language Processing
Short-container-title:IEEE/ACM Trans. Audio Speech Lang. Process.

Author:

Du Chenpeng¹, Guo Yiwei¹, Chen Xie¹, Yu Kai¹

Affiliation:

1. X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, Shanghai, China

Funder

Scientific and Technological Innovation 2030

Shanghai Municipal Science and Technology Major Project

Special Program of Suzhou Innovation and Entrepreneurship Leading Talents

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Subject

Electrical and Electronic Engineering, Acoustics and Ultrasonics, Computer Science (miscellaneous), Computational Mathematics

Link

http://xplorestaging.ieee.org/ielx7/6570655/9970249/10229489.pdf?arnumber=10229489

Reference: 58 articles.

1. Transfer learning from speaker verification to multispeaker text-to-speech synthesis;Jia;Proc Int Adv Conf Neural Inf Process Syst

2. Seen and Unseen Emotional Style Transfer for Voice Conversion with A New Emotional Speech Dataset

3. The relationship between fundamental frequency variation and articulation in healthy speech production;Behre;2017

4. Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech;Kim;Proc Int Conf Mach Learn

5. Virtual pitch and phase sensitivity of a computer model of the auditory periphery. II: Phase sensitivity

Cited by 7 articles.

1. Synthesis and Restoration of Traditional Ethnic Musical Instrument Timbres Based on Time-Frequency Analysis;Traitement du Signal;2024-04-30

2. SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

3. Acoustic BPE for Speech Generation with Discrete Tokens;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

4. VoiceFlow: Efficient Text-To-Speech with Rectified Flow Matching;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

5. Text-to-Speech Conversation Using Optimized Deep Learning Model;Lecture Notes in Networks and Systems;2024