Authors:
Wu Jianfeng, Hua Yongzhu, Yang Shengying, Qin Hongshuai, Qin Huibin
Abstract
This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates knowledge distilled from a traditional statistical method. Unlike other DNN-based methods, which typically train many different models on the same data and average their predictions, or use a large number of noise types to enlarge the set of simulated noisy speech, the proposed method neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. The discriminator and generator networks are then re-trained by distilling knowledge from the statistical method, an approach inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned on real noisy speech. Experiments on the CHiME4 datasets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.
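The abstract describes a three-stage training procedure: adversarial training of a generator and discriminator, re-training with a distillation target from a statistical enhancer, and fine-tuning on real noisy speech. The sketch below is a rough illustration of how such a pipeline could be arranged in PyTorch; it is not the authors' implementation. The feedforward network shapes, the MSE objectives, the loss weight alpha, and the use of the statistical enhancer's output as the only supervision during fine-tuning are all assumptions, and stage 2 is simplified to update only the generator.

import torch
import torch.nn as nn

FEAT = 257  # assumed spectral feature dimension per frame

generator = nn.Sequential(nn.Linear(FEAT, 512), nn.ReLU(), nn.Linear(512, FEAT))
discriminator = nn.Sequential(nn.Linear(FEAT, 512), nn.ReLU(), nn.Linear(512, 1))
bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

def adversarial_step(noisy, clean):
    # Stage 1: train generator and discriminator jointly with adversarial learning.
    enhanced = generator(noisy)
    # Discriminator learns to separate clean frames from enhanced frames.
    d_loss = bce(discriminator(clean), torch.ones(clean.size(0), 1)) + \
             bce(discriminator(enhanced.detach()), torch.zeros(noisy.size(0), 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Generator learns to fool the discriminator while matching the clean target.
    g_loss = bce(discriminator(enhanced), torch.ones(noisy.size(0), 1)) + mse(enhanced, clean)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

def distillation_step(noisy, clean, teacher_estimate, alpha=0.5):
    # Stage 2 (simplified): re-train the generator with a soft target distilled
    # from a traditional statistical enhancer (e.g. an MMSE-type estimate).
    enhanced = generator(noisy)
    loss = (1 - alpha) * mse(enhanced, clean) + alpha * mse(enhanced, teacher_estimate)
    g_opt.zero_grad(); loss.backward(); g_opt.step()

def finetune_step(real_noisy, teacher_estimate):
    # Stage 3: fine-tune on real noisy speech; with no clean reference available,
    # the statistical teacher's estimate is assumed here to act as the target.
    loss = mse(generator(real_noisy), teacher_estimate)
    g_opt.zero_grad(); loss.backward(); g_opt.step()

In this reading, the distillation weight alpha controls how much the generator trusts the statistical teacher relative to the clean reference, which is one plausible way to realize the knowledge transfer the abstract mentions.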
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
16 articles.