Speech Separation Using Convolutional Neural Network and Attention Mechanism

Published: 2020-07-25 Volume: 2020 Page: 1-10
ISSN:1026-0226
Container-title:Discrete Dynamics in Nature and Society
language:en
Short-container-title:Discrete Dynamics in Nature and Society

Author:

Yuan Chun-Miao¹, Sun Xue-Mei¹, Zhao Hu¹

Affiliation:

1. School of Computer Science and Technology, TianGong University, Tianjin 300387, China

Abstract

Speech is the most important means of human communication, and separating the target voice from a mixture of sound signals is crucial. This paper proposes a speech separation model based on a convolutional neural network and an attention mechanism. The model takes the magnitude spectrum of the mixed speech signal as input, which is high-dimensional. An analysis of the two components shows that the convolutional neural network can effectively extract low-dimensional features and mine the spatiotemporal structure of the speech signal, while the attention mechanism reduces the loss of sequence information. Combining the two mechanisms effectively improves separation accuracy. Compared with the typical speech separation model DRNN-2 + discrim, the method achieves a 0.27 dB gain in GNSDR and a 0.51 dB gain in GSIR, which indicates that the proposed model achieves a good separation effect.
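The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' model: the paper uses trained convolutional layers and a learned attention mechanism, whereas this sketch only computes a magnitude spectrum from a signal and applies plain scaled dot-product self-attention across time frames with no learned weights. The function names, the frame length of 1024, the hop of 512, and the random test signal are all assumptions chosen for illustration.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=1024, hop=512):
    """Return the STFT magnitude, shape (num_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    # rfft keeps only the non-negative frequency bins of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1))

def self_attention(features):
    """Scaled dot-product self-attention over time frames (no learned weights)."""
    dim = features.shape[1]
    scores = features @ features.T / np.sqrt(dim)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each output frame is a weighted average of all input frames.
    return weights @ features, weights

# Hypothetical stand-in for one second of a mixed signal sampled at 16 kHz.
rng = np.random.default_rng(0)
mixture = rng.standard_normal(16000)

spec = magnitude_spectrogram(mixture)
context, weights = self_attention(spec)
print(spec.shape, context.shape, weights.shape)
```

Each frame here has 513 frequency bins, which shows why the input is described as high-dimensional. In a trained model, convolutional layers would first reduce these frames to lower-dimensional features, and the attention weights would be computed from learned projections rather than from the raw spectrum.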

Funder

Program for the Science and Technology Plans of Tianjin, China

Publisher

Hindawi Limited

Subject

Modeling and Simulation

Link

http://downloads.hindawi.com/journals/ddns/2020/2196893.pdf
