Abstract
Multiple sound source separation in reverberant environments has attracted growing attention in recent years. To improve the quality of the separated signals in such environments, this paper proposes a separation method based on a direction-of-arrival (DOA) cue and a deep neural network (DNN). First, a pre-processing model based on non-negative matrix factorization (NMF) dereverberates the recorded signal, which makes source separation more efficient. Then, a multi-source separation algorithm that combines the recovery of sparse and non-sparse component points is used to obtain each sound source signal from the dereverberated signal. For sparse component points, the dominant sound source at each point is determined by a DOA cue. For non-sparse component points, a DNN is used to recover each sound source signal. Finally, the signals separated from the sparse and non-sparse component points are matched by temporal correlation to obtain each sound source signal. Both objective and subjective evaluation results indicate that the proposed separation approach outperforms the existing method in high-reverberation environments.
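The final matching step can be illustrated with a small sketch. The abstract does not give the details of the temporal correlation used, so everything below is an assumption made for illustration: the function names `frame_energy` and `match_by_temporal_correlation`, the use of short-time energy envelopes, the Pearson correlation, and the exhaustive search over permutations are hypothetical choices, not the authors' implementation.

```python
from itertools import permutations

import numpy as np


def frame_energy(x, frame_len=256):
    """Short-time energy envelope of a 1-D signal (non-overlapping frames)."""
    n_frames = len(x) // frame_len
    frames = np.asarray(x)[: n_frames * frame_len].reshape(n_frames, frame_len)
    return (frames ** 2).sum(axis=1)


def match_by_temporal_correlation(sparse_parts, nonsparse_parts, frame_len=256):
    """Pair each sparse-point signal with one non-sparse-point signal.

    Assumption: parts that belong to the same source have similar energy
    envelopes over time, so the pairing with the highest total envelope
    correlation is taken as the match.
    """
    n = len(sparse_parts)
    env_s = np.array([frame_energy(s, frame_len) for s in sparse_parts])
    env_n = np.array([frame_energy(s, frame_len) for s in nonsparse_parts])

    # corr[i, j]: Pearson correlation between the envelope of sparse part i
    # and the envelope of non-sparse part j.
    corr = np.corrcoef(env_s, env_n)[:n, n:]

    # Exhaustive search is fine for a handful of sources.
    best = max(
        permutations(range(n)),
        key=lambda p: sum(corr[i, p[i]] for i in range(n)),
    )

    # Sum the matched parts to form one signal per source.
    combined = np.array(
        [sparse_parts[i] + nonsparse_parts[best[i]] for i in range(n)]
    )
    return best, combined
```

Here `best[i]` is the index of the non-sparse-point signal paired with sparse-point signal `i`. For more than a few sources, the exhaustive search could be replaced with an assignment solver.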
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science