Authors:
Chen Long, Chen Guitong, Huang Lei, Choy Yat-Sze, Sun Weize
Abstract
Simultaneous localization, separation, and reconstruction of multiple sound sources are often required in situations such as conference rooms, living rooms, and supermarkets. Deep neural networks (DNNs) have achieved considerable success in time-domain signal separation and reconstruction, improving the intelligibility of speech signals. In this paper, we propose a hybrid microphone-array signal processing approach for the nearfield scenario that combines the beamforming technique and a DNN, so that both the location and the content of a sound source can be identified. Moreover, a sequenced virtual sound field reconstruction process makes the proposed approach well suited to sound fields containing a dominant, stronger sound source and masked, weaker sound sources: by repeating this process in a loop, all traceable main sound sources in a given sound field can be discovered. The computation time and localization accuracy are further improved by substituting the broadband weighted multiple signal classification (BW-MUSIC) method for the conventional delay-and-sum (DAS) beamforming algorithm. The effectiveness of the proposed method for localizing and reconstructing speech signals was validated by simulations and experiments with promising results: the localization was accurate, and the similarity and correlation between the reconstructed and original signals were high.
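The abstract contrasts BW-MUSIC with the conventional delay-and-sum (DAS) baseline. As a point of reference, the sketch below shows a minimal nearfield DAS beamformer: it aligns each microphone channel by the propagation delay from a candidate focus point and sums the channels, so that power is maximized when the focus point coincides with the true source. This is a generic illustration of DAS only, not the paper's implementation; the array geometry, sample rate, and integer-sample delay approximation are illustrative assumptions.

```python
import numpy as np


def delay_and_sum(signals, mic_positions, focus_point, fs, c=343.0):
    """Nearfield delay-and-sum beamformer (illustrative sketch).

    signals:       (n_mics, n_samples) microphone recordings
    mic_positions: (n_mics, 3) Cartesian mic coordinates in metres
    focus_point:   (3,) candidate source location in metres
    fs:            sample rate in Hz; c: speed of sound in m/s
    Returns the beamformed time signal steered at focus_point.
    """
    # Propagation distance from the focus point to each microphone
    dists = np.linalg.norm(mic_positions - focus_point, axis=1)
    # Relative delay (integer samples) w.r.t. the closest microphone
    delays = np.round((dists - dists.min()) / c * fs).astype(int)
    n = signals.shape[1]
    out = np.zeros(n)
    for sig, d in zip(signals, delays):
        # Advance each channel by its delay so wavefronts align, then sum
        out[: n - d] += sig[d:]
    return out / len(signals)


if __name__ == "__main__":
    # Toy scene: an 8-mic line array and a 440 Hz tone at a nearfield point
    fs, n = 16000, 2048
    mics = np.array([[i * 0.1, 0.0, 0.0] for i in range(8)])
    src = np.array([0.0, 0.5, 0.0])
    s = np.sin(2 * np.pi * 440 * np.arange(n) / fs)
    # Simulate per-channel arrival delays for the true source position
    dists = np.linalg.norm(mics - src, axis=1)
    true_delays = np.round((dists - dists.min()) / 343.0 * fs).astype(int)
    sigs = np.zeros((8, n))
    for i, d in enumerate(true_delays):
        sigs[i, d:] = s[: n - d]
    focused = delay_and_sum(sigs, mics, src, fs)
    wrong = delay_and_sum(sigs, mics, np.array([2.0, 0.1, 0.0]), fs)
    print(np.sum(focused**2), np.sum(wrong**2))
```

Scanning the focus point over a grid and picking the location of maximum output power is the classical DAS localization map; MUSIC-type methods such as the paper's BW-MUSIC instead exploit the eigenstructure of the spatial covariance matrix to obtain sharper peaks at lower cost over broadband signals.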
Funder
National Natural Science Foundation of China
Natural Science Foundation of Guangdong
Foundation of Shenzhen
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
12 articles.