AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation

Published:2022-12-11
Container-title:2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Author:

Song Kun¹, Xue Heyang², Wang Xinsheng¹, Cong Jian¹, Zhang Yongmao¹, Xie Lei¹, Yang Bing³, Zhang Xiong³, Su Dan³

Affiliation:

1. Northwestern Polytechnical University, School of Computer Science, Xi’an, China

2. Northwestern Polytechnical University, School of Software, Xi’an, China

3. Cloud and Smart Industries Group, Tencent Technology Co., Ltd., China

Publisher

IEEE

Link

http://xplorestaging.ieee.org/ielx7/10037756/10037573/10037585.pdf?arnumber=10037585

References: 25 articles.

1. LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

2. Distilling the Knowledge in a Neural Network; Hinton; CoRR, 2015

3. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention; Katharopoulos; Proceedings of the 37th International Conference on Machine Learning (ICML 2020), 13-18 July 2020, Virtual Event, Proceedings of Machine Learning Research, 2020

4. Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation

5. FastSpeech 2: Fast and High-Quality End-to-End Text to Speech; Ren; 9th International Conference on Learning Representations (ICLR 2021), 2021

Cited by 3 articles.

1. High-Fidelity Neural Phonetic Posteriorgrams; 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW); 2024-04-14

2. Variable-Length Speaker Conditioning in Flow-Based Text-to-Speech; IEEE Signal Processing Letters; 2024

3. VITS: Quality Vs. Speed Analysis; Text, Speech, and Dialogue; 2023