An amalgamation of integrated features with DeepSpeech2 architecture and improved spell corrector for improving Gujarati language ASR system

Published: 2024-02-13 Issue: 1 Volume: 27 Pages: 87-99
ISSN: 1381-2416
Container-title: International Journal of Speech Technology
Language: en
Short-container-title: Int J Speech Technol

Author:

Dua Mohit, Bhagat Bhavesh, Dua Shelza

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s10772-024-10082-z.pdf

References: 28 articles.

1. Amodei, D., et al. (2016). Deep Speech 2: End-to-end speech recognition in English and Mandarin. In Proceedings of the 33rd international conference on machine learning (vol. 48, pp. 173–182). Retrieved from https://proceedings.mlr.press/v48/amodei16.html

2. Anoop, C. S., & Ramakrishnan, A. G. (2021, July). CTC-based end-to-end ASR for the low resource Sanskrit language with spectrogram augmentation. In 2021 National conference on communications (NCC) (pp. 1–6). IEEE.

3. Bhogale, K., Raman, A., Javed, T., Doddapaneni, S., Kunchukuttan, A., Kumar, P., & Khapra, M. M. (2023, June). Effectiveness of mining audio and text pairs from public data for improving ASR systems for low-resource languages. In ICASSP 2023 - 2023 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 1–5). IEEE.

4. Billa, J. (2018). ISI ASR system for the low resource speech recognition challenge for Indian languages. In INTERSPEECH 2018.

5. Cho, K., et al. (2014, October). Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1724–1734). https://doi.org/10.3115/v1/D14-1179