Affiliation:
1. School of Humanities, Jiangxi University of Chinese Medicine, Nanchang, 330004 Jiangxi, China
Abstract
Error detection and accuracy estimation in automated speech recognition (ASR) systems play a vital part in the design of human-computer spoken dialogue systems, because recognition errors can prevent a system from understanding the end user's intentions. The major aim is to identify the errors in an utterance so that the dialogue manager can provide proper clarifications to the user. The design of accurate error detection and accuracy estimation techniques is therefore essential in an ASR system. With this motivation, this paper presents a novel artificial intelligence-enabled accuracy estimation and error detection technique for the English speech recognition system (AIEDAE-ESRS). The AIEDAE-ESRS technique performs three tasks: confidence estimation, out-of-vocabulary (OOV) word identification, and error type categorization. In addition, it applies several levels of preprocessing, including sampling of the input speech signal, bandpass filtering, and noise removal. A new deep neural network with hidden Markov model (DNN-HMM) based speech recognition technique is also designed, which aims to estimate the accuracy and error. Finally, the hyperparameters of the DNN-HMM model are optimally chosen using the flower pollination algorithm (FPA), thereby improving recognition performance. To demonstrate the performance of the AIEDAE-ESRS technique, a series of simulations was conducted and the results were examined under varying aspects. The simulation results showed enhanced performance of the AIEDAE-ESRS methodology over current advanced approaches, in terms of efficiency and across a variety of measures. Our AIEDAE-ESRS methodology outperforms existing methodologies by a factor of ten.
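The abstract names the flower pollination algorithm (FPA) as the hyperparameter optimizer but gives no details, so the following is only a minimal, generic sketch of FPA in Python, not the authors' implementation. The function names, the switch probability of 0.8, the Lévy exponent of 1.5, and the toy objective `toy_error` (a stand-in for a validation error that would normally come from training the DNN-HMM model) are all assumptions made for illustration.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    # Guard against a (practically impossible) zero denominator.
    return u / max(abs(v), 1e-12) ** (1 / beta)


def flower_pollination(objective, bounds, n_flowers=20, n_iter=200,
                       p_switch=0.8, seed=0):
    """Minimize `objective` over box `bounds` with a basic FPA loop."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) interval.
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    flowers = [[rng.uniform(lo, hi) for lo, hi in bounds]
               for _ in range(n_flowers)]
    scores = [objective(f) for f in flowers]
    best_idx = min(range(n_flowers), key=scores.__getitem__)
    best, best_score = flowers[best_idx][:], scores[best_idx]

    for _ in range(n_iter):
        for i in range(n_flowers):
            if rng.random() < p_switch:
                # Global pollination: Levy flight relative to the best flower.
                cand = [x + levy_step(rng) * (b - x)
                        for x, b in zip(flowers[i], best)]
            else:
                # Local pollination: mix two randomly chosen flowers.
                j, k = rng.sample(range(n_flowers), 2)
                eps = rng.random()
                cand = [x + eps * (xj - xk)
                        for x, xj, xk in zip(flowers[i], flowers[j], flowers[k])]
            cand = clip(cand)
            score = objective(cand)
            if score < scores[i]:  # greedy replacement
                flowers[i], scores[i] = cand, score
                if score < best_score:
                    best, best_score = cand[:], score
    return best, best_score


def toy_error(params):
    """Hypothetical stand-in for validation error; minimum at (-3, 0.3)."""
    log_lr, dropout = params
    return (log_lr + 3.0) ** 2 + (dropout - 0.3) ** 2


# Search over log10(learning rate) and dropout rate (both hypothetical).
search_space = [(-5.0, -1.0), (0.0, 0.9)]
best_params, best_error = flower_pollination(toy_error, search_space)
```

In a real setting, `toy_error` would be replaced by a function that trains and validates the recognizer for a given hyperparameter vector; that step dominates the cost, so the population size and iteration count would likely need to be much smaller than in this sketch.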
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Information Systems