Automatic Speech Recognition Performance Improvement for Mandarin Based on Optimizing Gain Control Strategy

Published: 2022-04-15 Issue: 8 Volume: 22 Page: 3027
ISSN: 1424-8220
Container-title: Sensors
Language: en
Short-container-title: Sensors

Author:

Wang Desheng, Wei Yangjie, Zhang Ke, Ji Dong, Wang Yi

Abstract

Automatic speech recognition (ASR) is an essential technique for human–computer interaction, and gain control is a commonly used operation in ASR. However, inappropriate gain control strategies can increase the word error rate (WER) of ASR. Because the relationship between gain control and WER currently lacks sufficient theoretical analysis and proof, various unconstrained gain control strategies have been adopted in realistic ASR systems, and the optimal gain control, that is, the one yielding the lowest WER, is rarely achieved. This study proposes a gain control strategy named maximized original signal transmission (MOST) to minimize the adverse impact of gain control on ASR systems. First, by modeling the gain control strategy, the quantitative relationship between the gain control strategy and ASR performance was established using the noise figure index. Second, by analyzing this quantitative relationship, an optimal MOST gain control strategy with minimal performance degradation was theoretically deduced. Finally, comprehensive comparative experiments on a Mandarin dataset show that the proposed MOST gain control strategy can significantly reduce the WER of the experimental ASR system, with a 10% mean absolute WER reduction at −9 dB gain.
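
For readers unfamiliar with the two quantities the abstract reports, the following is a minimal illustrative sketch in Python. It is not the MOST strategy and is not taken from the paper. It only shows, under common textbook conventions, what a gain in dB means for sample amplitude and how WER is usually computed as a token-level edit distance. The function names (`db_to_linear`, `apply_gain`, `word_error_rate`) and the hard-clipping behavior are assumptions made for illustration.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in decibels to a linear scale factor."""
    return 10.0 ** (gain_db / 20.0)


def apply_gain(samples, gain_db, full_scale=1.0):
    """Scale samples by gain_db and hard-clip to [-full_scale, full_scale].

    Hard clipping is an illustrative assumption, not the paper's model.
    """
    factor = db_to_linear(gain_db)
    return [max(-full_scale, min(full_scale, s * factor)) for s in samples]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / len(reference).

    Computed as the Levenshtein distance between two token lists.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


if __name__ == "__main__":
    # A -9 dB gain attenuates amplitude to roughly 35% of the original.
    print(round(db_to_linear(-9.0), 4))
    # One substitution and one deletion against a 4-token reference.
    print(word_error_rate(["a", "b", "c", "d"], ["a", "x", "c"]))
```

Under this convention, an absolute WER reduction of 10% means the WER value itself drops by 0.10 (for example from 0.30 to 0.20), as opposed to a relative reduction of one tenth.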

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/22/8/3027/pdf

References: 46 articles.

1. Future Challenges in the Next Generation of Voice User Interface

2. Robust voice user interface for internet-of-things

3. Multimodal Corpus Design for Audio-Visual Speech Recognition in Vehicle Cabin

Cited by 3 articles.

1. Decoding and Analysing Consumer Feedback for Companies and Goods using Machine Learning;2023 Third International Conference on Artificial Intelligence and Smart Energy (ICAIS);2023-02-02

2. Non-Autoregressive End-to-End Neural Modeling for Automatic Pronunciation Error Detection;Applied Sciences;2022-12-22

3. Use Brain-Like Audio Features to Improve Speech Recognition Performance;Journal of Sensors;2022-09-19