Abstract
The article notes that forming a training set for a neural network is a complex and lengthy process, because the correctness of the set is checked by assessing the quality of the network after it has been trained. It also notes that the commonly used formal method of forming a training set, which does not take into account the physical processes of data and signal transformation in real devices, degrades the quality of a network that filters noise from a speech signal. Methods and means for filtering noise from speech signals are described. To solve the filtering problem, the sequence of main stages for processing a noisy speech signal is presented, and each stage is described. The article proposes choosing the filtering method from an analysis of the noise characteristics, and recommends distinguishing between homogeneous (monotonic) and dynamically changing (random) noise, which call for different filtering methods. The choice of method should also take into account how closely the frequency range of the noise matches that of the speech signal. As the main way to reduce noise, the article proposes an improved and tested variant of spectral subtraction, in which the spectral components of the noise are subtracted from the spectrum of the noisy signal. This approach is proposed for forming and correcting the training set of a neural network designed to reduce noise in a speech signal. Practical application of the tested filtering method confirmed that its use is feasible. An important result of the work is that the feasibility of specific corrective changes to the training set can be evaluated by comparing the network's output with the filtering results of the modified and tested method.
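The spectral subtraction idea named in the abstract can be illustrated with a minimal Python sketch. This is not the article's improved method, whose details are not given on this page; it is the basic textbook variant, written under stated assumptions: the first few frames of the recording contain noise only, the noise is roughly stationary, and the noisy phase is reused for resynthesis. The function name `spectral_subtract` and all parameter names and default values are hypothetical choices made for illustration.

```python
import numpy as np

def spectral_subtract(noisy, frame_len=256, hop=128, noise_frames=6, floor=0.05):
    """Basic magnitude spectral subtraction (illustrative sketch only)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Slice the signal into overlapping, windowed frames and take their spectra.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    magnitude, phase = np.abs(spectra), np.angle(spectra)

    # Assumption: the first `noise_frames` frames contain noise only.
    noise_mag = magnitude[:noise_frames].mean(axis=0)

    # Subtract the noise estimate; a spectral floor prevents negative magnitudes.
    cleaned = np.maximum(magnitude - noise_mag, floor * magnitude)

    # Resynthesise with the noisy phase and weighted overlap-add.
    frames_out = np.fft.irfft(cleaned * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += frames_out[i] * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

# Demo on synthetic data: 0.25 s of noise only, then a 440 Hz tone in noise.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.where(t >= 0.25, np.sin(2 * np.pi * 440 * t), 0.0)
noisy = clean + 0.1 * rng.standard_normal(fs)
denoised = spectral_subtract(noisy)
```

In the spirit of the abstract, an output such as `denoised` could serve as a reference against which a candidate training-set correction is compared. Note that this simple variant suits homogeneous noise; dynamically changing noise would need a noise estimate that is updated over time, which the sketch does not attempt.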