Affiliation:
1. Hunan First Normal University, Changsha, Hunan, China.
Abstract
With the continuous development of computer technology and the growing popularization of Mandarin, computer technology plays an increasingly significant role in computer-assisted language learning and speech recognition. In this study, an acoustic model and a speech model based on the Hidden Markov Model (HMM) are constructed for detecting errors in Mandarin read-aloud speech. Acoustic features are then extracted from the speech signals to build a read-aloud error detection model based on pronunciation features. On this basis, a DNN-HMM hybrid model is built by combining deep neural networks with Hidden Markov Models to detect keyword errors in Mandarin read-aloud speech. The empirical analysis shows that the proposed Mandarin reading error detection model achieves an average accuracy of 92.37%. Compared with the other models, the improvement in average accuracy ranges from 4.69% to 8.19%, and the average accuracy for vowel and consonant pronunciation features is 85.04% and 81.69%, respectively. For the six error types (misreading, back-reading, adding, changing, omitting, and other), the F-score is above 80% and the accuracy is above 83%. These results indicate that the deep learning-based model performs well and provides an effective method for error detection in Mandarin read-aloud speech.