Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses

Published: 2022-05-23
Container-title: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Author:

Chen Zhehuai¹, Zhang Yu¹, Rosenberg Andrew¹, Ramabhadran Bhuvana¹, Moreno Pedro¹, Wang Gary¹

Affiliation:

1. Google, Inc.

Publisher

IEEE

References: 37 articles.

1. Hierarchical generative modeling for controllable speech synthesis; Hsu; 2018

2. Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition; Zhang; 2020

5. Group normalization; Wu; Proceedings of the European Conference on Computer Vision (ECCV); 2018

Cited by 11 articles.

1. Retrieval Augmented End-to-End Spoken Dialog Models; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14

2. Universal Cross-Lingual Data Generation for Low Resource ASR; IEEE/ACM Transactions on Audio, Speech, and Language Processing; 2024

3. Can Unpaired Textual Data Replace Synthetic Speech in ASR Model Adaptation?; 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU); 2023-12-16