1. AiHub (2021). AIHub broadcast content Korean speech recognition
data. Retrieved from https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=463
2. Baevski, A., Zhou, Y., Mohamed, A., & Auli, M. (2020, December).
wav2vec 2.0: A framework for self-supervised learning of speech representations.
Advances in Neural Information Processing
Systems (pp. 12449-12460). Online Conference.
3. Bang, J. U., Yun, S., Kim, S. H., Choi, M. Y., Lee, M. K., Kim, Y.
J., Kim, D. H., ... Kim, S. H. (2020). KsponSpeech: Korean spontaneous speech
corpus for automatic speech recognition. Applied Sciences,
10(19), 6936. 10.3390/app10196936
4. Chan, W., Jaitly, N., Le, Q., & Vinyals, O. (2016, March).
Listen, attend and spell: A neural network for large vocabulary conversational
speech recognition. Proceedings of the 2016 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp.
4960-4964). Shanghai, China. 10.1109/ICASSP.2016.7472621
5. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020, July).
A simple framework for contrastive learning of visual representations.
Proceedings of the 37th International Conference on Machine
Learning (pp. 1597-1607). Online Conference.