Author:
Liu Wenzhe, Li Andong, Wang Xiao, Yuan Minmin, Chen Yi, Zheng Chengshi, Li Xiaodong
Abstract
Most deep-learning-based multi-channel speech enhancement methods focus on designing a set of beamforming coefficients to directly filter the low signal-to-noise-ratio signals received by the microphones, which limits their performance. To address this problem, this paper designs a causal neural filter that fully exploits the spectro-temporal-spatial information in the beamspace domain. Specifically, in the first stage, multiple beams steered toward all directions are formed with a parameterized super-directive beamformer. In the second stage, a deep-learning-based filter is learned that jointly models the spectro-temporal-spatial discriminability of the speech and the interference, so as to coarsely extract the desired speech. Finally, to further suppress the remaining interference components, especially at low frequencies, a residual estimation module refines the output of the second stage. Experimental results demonstrate that the proposed approach outperforms many state-of-the-art (SOTA) multi-channel methods on a multi-channel speech dataset generated from the DNS-Challenge dataset.
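The abstract does not specify the exact parameterization of the first-stage beamformer. As a minimal sketch, assuming a standard super-directive design (an MVDR solution against a spherically isotropic diffuse-noise coherence model), the fixed weights for a bank of beams steered toward several look directions could be computed as follows; the function name, array geometry, and diagonal-loading parameter are hypothetical, not the paper's actual design:

```python
import numpy as np

def superdirective_weights(mic_pos, look_dirs, freqs, c=343.0, diag_load=1e-3):
    """Sketch of fixed super-directive beamformer weights for several beams.

    mic_pos  : (M, 3) microphone coordinates in meters
    look_dirs: (B, 3) unit vectors, one per look direction (beam)
    freqs    : (F,) analysis frequencies in Hz
    returns  : (B, F, M) complex beamforming weights
    """
    M = mic_pos.shape[0]
    # Pairwise microphone distances for the diffuse-noise coherence model.
    dists = np.linalg.norm(mic_pos[:, None, :] - mic_pos[None, :, :], axis=-1)
    weights = np.zeros((len(look_dirs), len(freqs), M), dtype=complex)
    for fi, f in enumerate(freqs):
        # Spherically isotropic (diffuse) noise coherence: sin(2*pi*f*d/c) / (2*pi*f*d/c).
        Gamma = np.sinc(2.0 * f * dists / c)  # np.sinc already includes the pi factor
        Gamma += diag_load * np.eye(M)        # diagonal loading for robustness
        Gamma_inv = np.linalg.inv(Gamma)
        for bi, u in enumerate(look_dirs):
            # Far-field steering vector for look direction u.
            tau = mic_pos @ u / c
            d = np.exp(-2j * np.pi * f * tau)
            # MVDR-style solution: w = Gamma^{-1} d / (d^H Gamma^{-1} d).
            num = Gamma_inv @ d
            weights[bi, fi] = num / (d.conj() @ num)
    return weights
```

Applying weights[b, f] to the M-channel STFT bin X[:, f, t] via an inner product yields the b-th beam signal; stacking all B beams would give the beamspace representation that the second-stage neural filter consumes.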
Funder
National Environmental Protection Engineering and Technology Center for Road Traffic Noise Control
Subject
Physics and Astronomy (miscellaneous), General Mathematics, Chemistry (miscellaneous), Computer Science (miscellaneous)
Cited by
6 articles.