Affiliation:
1. School of Chinese Language and Literature, Soochow University, Suzhou, 215000, China
Abstract
Advances in intelligent technology have driven rapid progress in speech processing. To broaden the application scenarios of speech recognition systems, this study improves a traditional speech enhancement algorithm, the Ideal Binary Mask (IBM) algorithm, and combines the improved version with the unimproved IBM algorithm to propose an adaptive IBM algorithm. Based on this algorithm, the study builds a new speech recognition system. The system uses an FIR filter for pre-emphasis and Berouti spectral subtraction to preprocess speech, and the speech enhancement model is built with a deep learning network. The results showed that the IBM algorithm achieved the highest Perceptual Evaluation of Speech Quality (PESQ) score, 3.5596, followed by the Ideal Ratio Mask (IRM) algorithm at 3.3429. The improvement to the IBM algorithm was feasible when the noise intensity coefficient was greater than 0.008. When the noise intensity coefficient was greater than 0.08, the average score of the improved IBM algorithm was 2.1079, and the average score of the unimproved IBM algorithm was 1.9418. The proposed adaptive IBM algorithm performs better in complex speech environments than the original system.
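For readers unfamiliar with the building blocks named in the abstract, the following is a minimal Python sketch of a first-order FIR pre-emphasis filter and of the textbook IBM and IRM definitions. It is not the authors' implementation: the function names, the pre-emphasis coefficient `alpha=0.97`, the local criterion `lc_db=0.0`, and the square-root form of the IRM are assumptions chosen for illustration, and the adaptive IBM variant and Berouti spectral subtraction are not reproduced here.

```python
import numpy as np

EPS = 1e-12  # small constant to avoid division by zero (illustrative choice)


def pre_emphasis(x, alpha=0.97):
    """First-order FIR high-pass filter: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a commonly used value, assumed here; the abstract
    does not state the coefficient.
    """
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        return x.copy()
    # The first sample has no predecessor, so it is passed through unchanged.
    return np.concatenate(([x[0]], x[1:] - alpha * x[:-1]))


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Textbook IBM: 1 where the local SNR in dB exceeds lc_db, else 0."""
    s = np.asarray(speech_power, dtype=float)
    n = np.asarray(noise_power, dtype=float)
    snr_db = 10.0 * np.log10((s + EPS) / (n + EPS))
    return (snr_db > lc_db).astype(float)


def ideal_ratio_mask(speech_power, noise_power):
    """Textbook IRM in its square-root form: sqrt(S / (S + N))."""
    s = np.asarray(speech_power, dtype=float)
    n = np.asarray(noise_power, dtype=float)
    return np.sqrt(s / (s + n + EPS))
```

In this sketch the masks take per-bin speech and noise power (for example, squared STFT magnitudes) and return a gain per time-frequency bin that would be multiplied with the noisy spectrum. The binary mask keeps or discards each bin outright, while the ratio mask attenuates it smoothly.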
Publisher
Association for Computing Machinery (ACM)