Authors:
Orken Mamyrbayev, Dina Oralbekova, Keylan Alimhan, Tolganay Turdalykyzy, Mohamed Othman
Abstract
Today, the Transformer model, which allows parallelization and has its own internal attention mechanism, is widely used in speech recognition. The great advantages of this architecture are its fast training speed and the absence of sequential operations, unlike recurrent neural networks. In this work, Transformer models and an end-to-end model based on connectionist temporal classification (CTC) were considered for building an automatic Kazakh speech recognition system. Kazakh belongs to the group of agglutinative languages and has limited data available for implementing speech recognition systems. Some studies have shown that the Transformer model improves system performance for low-resource languages. Our experiments revealed that the joint use of the Transformer and connectionist temporal classification models improved the performance of the Kazakh speech recognition system, and with an integrated language model it achieved the best character error rate of 3.7% on a clean dataset.
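The joint Transformer/CTC setup mentioned in the abstract is commonly trained with an interpolated objective, L = lambda * L_CTC + (1 - lambda) * L_attention. The following is a minimal sketch of that idea in PyTorch, not the authors' implementation: the layer sizes, the interpolation weight ctc_weight, the omission of positional encodings, and the toy inputs are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridTransformerCTC(nn.Module):
    # Sketch of a hybrid CTC/attention model; hyperparameters are assumptions.
    def __init__(self, n_feats=80, d_model=256, n_tokens=100, ctc_weight=0.3):
        super().__init__()
        self.ctc_weight = ctc_weight  # lambda in L = lambda*L_ctc + (1-lambda)*L_att
        self.input_proj = nn.Linear(n_feats, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=6)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=6)
        self.token_emb = nn.Embedding(n_tokens, d_model)
        self.ctc_head = nn.Linear(d_model, n_tokens)  # frame-level posteriors for CTC
        self.att_head = nn.Linear(d_model, n_tokens)  # token-level posteriors for the decoder
        self.ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)

    def forward(self, feats, feat_lens, tokens, token_lens):
        enc = self.encoder(self.input_proj(feats))  # (batch, frames, d_model)
        # CTC branch: per-frame log-probabilities, time-major as nn.CTCLoss expects.
        ctc_logp = F.log_softmax(self.ctc_head(enc), dim=-1).transpose(0, 1)
        loss_ctc = self.ctc_loss(ctc_logp, tokens, feat_lens, token_lens)
        # Attention branch: teacher-forced decoder predicting the next token
        # (shifted-target simplification, no explicit start-of-sequence symbol).
        dec_in = self.token_emb(tokens[:, :-1])
        causal = torch.triu(
            torch.full((dec_in.size(1), dec_in.size(1)), float("-inf")), diagonal=1)
        dec = self.decoder(dec_in, enc, tgt_mask=causal)
        loss_att = F.cross_entropy(
            self.att_head(dec).reshape(-1, self.att_head.out_features),
            tokens[:, 1:].reshape(-1))
        return self.ctc_weight * loss_ctc + (1 - self.ctc_weight) * loss_att

# Toy forward pass with random filterbank features and token ids.
model = HybridTransformerCTC()
feats = torch.randn(2, 120, 80)           # (batch, frames, filterbanks)
feat_lens = torch.tensor([120, 100])
tokens = torch.randint(1, 100, (2, 20))    # character/subword ids, 0 reserved for CTC blank
token_lens = torch.tensor([20, 15])
print(model(feats, feat_lens, tokens, token_lens))

In practice the CTC branch regularizes the attention decoder toward monotonic alignments, which is one reason the combination tends to help on low-resource, agglutinative languages such as Kazakh; an external language model can additionally be integrated during decoding, as the reported 3.7% character error rate suggests.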
Publisher
Springer Science and Business Media LLC
Cited by
21 articles.