Recognition of English speech – using a deep learning algorithm

Published:2023-01-01 Issue:1 Volume:32 Page:
ISSN:2191-026X
Container-title:Journal of Intelligent Systems
language:en
Author:

Wang Shuyan¹

Affiliation:

1. School of Foreign Languages, Zhengzhou University of Economics and Business, Zhengzhou, Henan 451191, China

Abstract

Accurate speech recognition benefits machine translation and intelligent human–computer interaction. After briefly introducing speech recognition algorithms, this study proposes recognizing speech with a recurrent neural network (RNN) and adopts the connectionist temporal classification (CTC) algorithm to forcibly align input speech sequences with output text sequences. Simulation experiments compared the RNN-CTC algorithm with the Gaussian mixture model–hidden Markov model (GMM-HMM) and convolutional neural network (CNN)-CTC algorithms. The results showed that the more training samples a speech recognition algorithm had, the higher its recognition accuracy after training, although training time also increased; and the more samples a trained algorithm had to test, the lower its recognition accuracy and the longer its testing time. With the same number of training and testing samples, the proposed RNN-CTC algorithm always had the highest accuracy and the shortest training and testing times of the three algorithms.
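The abstract does not give implementation details, so the following is only a minimal sketch of the general CTC idea of mapping a long frame-level sequence to a shorter text sequence; it is not the paper's code. It assumes a greedy (best-path) decoder, a hypothetical blank symbol `_`, and illustrative function names (`ctc_collapse`, `ctc_greedy_decode`) and scores that do not come from the paper.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems usually reserve an index for it


def ctc_collapse(path, blank=BLANK):
    """Apply the CTC collapsing rule: merge consecutive repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != blank)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Pick the highest-scoring symbol for each frame, then collapse the path.

    frame_scores: one list of per-symbol scores per audio frame
                  (for example, the softmax outputs of an RNN).
    alphabet:     symbols in the same order as the score columns.
    """
    best_path = [
        alphabet[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]
    return ctc_collapse(best_path, blank)


if __name__ == "__main__":
    # Illustrative scores for four frames over the alphabet (_, a, b).
    alphabet = ["_", "a", "b"]
    frame_scores = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.2, 0.7],
    ]
    print(ctc_greedy_decode(frame_scores, alphabet))
    print(ctc_collapse("hh_e_ll_l_oo"))
```

The blank symbol is what lets a repeated letter survive collapsing: in `hh_e_ll_l_oo` the two `l` runs are separated by a blank, so both are kept. Training with CTC sums over all frame-level paths that collapse to the target text, which this decoding sketch does not show.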

Publisher

Walter de Gruyter GmbH

Subject

Artificial Intelligence,Information Systems,Software

Link

https://www.degruyter.com/document/doi/10.1515/jisys-2022-0236/pdf

Cited by 3 articles.

1. Optimization and Application of Neural Network Machine Translation Model Based on Deep Learning;2024 International Conference on Electrical Drives, Power Electronics & Engineering (EDPEE);2024-02-27

2. Review of Deep Speech Recognizer using Transcriber;2023 6th International Conference on Advances in Science and Technology (ICAST);2023-12-08

3. English Pronunciation Quality Evaluation System Based on Continuous Speech Recognition Technology for Multi-Terminal;Journal of Physics: Conference Series;2023-11-01