Abstract
State-of-the-art results in single-channel speech enhancement were recently obtained with a CycleGAN network trained on unpaired data, an approach whose performance is comparable to that of neural networks trained on paired data. Because only a relatively small amount of noisy speech data is usually available in applications, an augmented, semi-supervised CycleGAN is proposed. A feature map regularized CycleGAN was also recently proposed and applied to the image transfer task, where it yielded significant improvements on several standard image domain transfer databases. In this paper, we combine the feature map regularized CycleGAN with the augmented semi-supervised approach to further improve CycleGAN speech enhancement. The proposed approach yields significant improvements on the speech enhancement task, according to several standard measures, over both the baseline CycleGAN and the augmented CycleGAN approach.
Publisher
Centre for Evaluation in Education and Science (CEON/CEES)