Non-Autoregressive End-to-End Neural Modeling for Automatic Pronunciation Error Detection

Published:2022-12-22 Issue:1 Volume:13 Page:109
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Wadud Md. Anwar Hussen, Alatiyyah Mohammed, Mridha M. F.

Abstract

Mispronunciation detection and diagnosis (MDD) is a crucial component of computer-assisted pronunciation training (CAPT) systems. The provided transcriptions can act as a teacher when evaluating the pronunciation quality of finite speech. Conventional approaches, such as forced alignment and extended recognition networks, have made full use of these prior texts for model development or for enhancing system performance. End-to-end (E2E) approaches have recently attempted to incorporate prior texts into model training, and preliminary results indicate efficacy. However, attention-based end-to-end models have shown lower speech recognition performance because multi-pass left-to-right forward computation constrains their practical applicability in beam search. In addition, end-to-end neural approaches are typically data-hungry, and a lack of non-native training data frequently impairs their effectiveness in MDD. To address these problems, we propose a novel MDD technique that uses non-autoregressive (NAR) end-to-end neural models to greatly reduce inference time while maintaining accuracy similar to that of conventional E2E neural models. In contrast to autoregressive models, NAR models can generate token sequences in parallel by accepting parallel inputs instead of relying on left-to-right forward computation. To further enhance the effectiveness of MDD, we design and build a pronunciation model on top of the NAR end-to-end models. We evaluate our approach against several of the best end-to-end models, using the publicly accessible L2-ARCTIC and SpeechOcean English datasets for training and testing; the proposed model outperforms the other existing models.
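The contrast between left-to-right and parallel decoding can be illustrated with a toy sketch. It is not the authors' model: it assumes a CTC-style non-autoregressive decoder in which each frame's token is chosen independently, followed by a simple comparison against the canonical (prior-text) phone sequence. The names `nar_greedy_decode`, `diagnose`, `VOCAB`, and `FRAME_SCORES`, as well as the scores themselves, are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

BLANK = "<b>"  # hypothetical CTC blank symbol

def nar_greedy_decode(frame_scores, vocab):
    """Choose the best token for every frame independently, then collapse CTC-style."""
    # No frame depends on another frame's output, so this step could run
    # in parallel, unlike left-to-right autoregressive decoding.
    best = [vocab[max(range(len(row)), key=row.__getitem__)] for row in frame_scores]
    phones, prev = [], None
    for tok in best:
        # Merge consecutive repeats and drop blanks.
        if tok != prev and tok != BLANK:
            phones.append(tok)
        prev = tok
    return phones

def diagnose(canonical, recognized):
    """List the alignment operations where recognized phones differ from the canonical ones."""
    ops = SequenceMatcher(a=canonical, b=recognized, autojunk=False).get_opcodes()
    return [(tag, canonical[i1:i2], recognized[j1:j2])
            for tag, i1, i2, j1, j2 in ops if tag != "equal"]

# Made-up per-frame scores over a four-symbol vocabulary (assumption, not real data).
VOCAB = [BLANK, "DH", "D", "AH"]
FRAME_SCORES = [
    [0.10, 0.20, 0.60, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.70],
    [0.20, 0.10, 0.10, 0.60],
]

recognized = nar_greedy_decode(FRAME_SCORES, VOCAB)   # ['D', 'AH']
errors = diagnose(["DH", "AH"], recognized)           # [('replace', ['DH'], ['D'])]
```

In this toy case the canonical phone DH is recognized as D, which the comparison reports as a substitution; a real MDD system would derive the frame scores from an acoustic model and would typically use a more careful alignment.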

Funder

Deanship of Scientific Research at Prince Sattam Bin Abdulaziz University

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science

Link

https://www.mdpi.com/2076-3417/13/1/109/pdf

Reference: 54 articles.

1. Mispronunciation detection and diagnosis in L2 English speech using multidistribution deep neural networks;Li;IEEE/ACM Trans. Audio Speech Lang. Process.,2016

2. A review of tools and techniques for computer aided pronunciation training (CAPT) in English;Agarwal;Educ. Inf. Technol.,2019

3. Lo, W.K., Zhang, S., and Meng, H. (2010, January 26–30). Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system. Proceedings of the Eleventh Annual Conference of the International Speech Communication Association, Makuhari, Japan.

4. Harrison, A.M., Lo, W.K., Qian, X.J., and Meng, H. (2009, January 3–5). Implementation of an extended recognition network for mispronunciation detection and diagnosis in computer-assisted pronunciation training. Proceedings of the International Workshop on Speech and Language Technology in Education, Warwickshire, UK.

5. Qian, X., Soong, F.K., and Meng, H. (2010, January 26–30). Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT). Proceedings of the Eleventh Annual Conference of the International Speech Communication Association, Makuhari, Japan.

Cited by 8 articles.

1. 2D Spectrogram analysis using vision transformer to detect mispronounced Arabic utterances for children;Applied Soft Computing;2024-11

2. Evaluation of English Pronunciation Interaction Quality Based on Deep Learning;2024 International Conference on Integrated Circuits and Communication Systems (ICICACS);2024-02-23

3. Improving Healthcare Efficiency via Sensor-Based Remote Monitoring of Patient Health Utilizing an Enhanced AdaBoost Algorithm;Studies in Big Data;2024

4. Marine Animal Classification Using Deep Learning and Convolutional Neural Networks (CNN);2023 26th International Conference on Computer and Information Technology (ICCIT);2023-12-13

5. English Pronunciation Error Detection Method Based on Multiple Model Fusion;2023 International Conference on Network, Multimedia and Information Technology (NMITCON);2023-09-01