Affiliation:
1. School of Visual Communication Design, LuXun Academy of Fine Arts, Shenyang 110000, Liaoning, China
Abstract
In the presence of environmental noise in an exhibition hall, speakers tend to alter their speech production to keep communication intelligible. Although Automatic Speech Recognition (ASR) has made great progress, its performance still degrades significantly in noisy environments. The degree of speech impairment and the effectiveness of machine recognition of impaired speech are usually assessed separately and independently. Convolutional Neural Networks (CNN) have been applied effectively to speech recognition and computer vision tasks. Hence, this study uses a Deep Convolutional Neural Network-based Automatic Speech Recognition Model (DCNN-ASRM) for effective speech recognition in the noisy exhibition hall. The study configures the filter sizes, pooling, and input feature map: the filter and pooling sizes are decreased, and the dimension of the input feature is extended, so that more convolution layers can be stacked. Furthermore, an in-depth analysis of the proposed DCNN-ASRM model reveals critical properties such as fast convergence, compact model size, and noise robustness in speech recognition. The simulation analysis shows that the suggested DCNN-ASRM model achieves a recognition accuracy ratio of 98.1%, a performance ratio of 97.2%, and a noise reduction ratio of 96.5%, and reduces the word error rate by 9.2% and the signal-to-noise ratio by 10.3% compared to other existing models.
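The link between smaller filters and pooling, a larger input feature map, and a deeper network can be illustrated with simple size arithmetic. The following is a minimal sketch, not the authors' configuration: the function name `max_conv_pool_layers` and all numeric sizes are hypothetical, and it assumes stride-1 convolution without padding followed by non-overlapping pooling along a single axis.

```python
def max_conv_pool_layers(input_size: int, filter_size: int, pool_size: int) -> int:
    """Count how many (convolution -> pooling) stages fit along one axis.

    Assumptions (illustrative only): stride-1 convolution with no padding,
    followed by non-overlapping pooling with floor rounding.
    """
    layers = 0
    size = input_size
    while size >= filter_size:
        conv_out = size - filter_size + 1   # output length of an unpadded convolution
        pooled = conv_out // pool_size      # output length after pooling
        if pooled < 1:
            break                           # feature map exhausted
        size = pooled
        layers += 1
    return layers


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the trend.
    print(max_conv_pool_layers(40, 9, 3))   # large filter, large pooling
    print(max_conv_pool_layers(40, 3, 2))   # smaller filter and pooling
    print(max_conv_pool_layers(80, 3, 2))   # larger input feature dimension
```

Under these assumptions, shrinking the filter and pooling sizes, or enlarging the input dimension, raises the number of stages that fit before the feature map is exhausted, which is the effect the abstract describes.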
Subject
Computer Networks and Communications, Computer Science Applications
Cited by
2 articles.