Affiliation:
1. CEIEC, Universidad Francisco de Vitoria, Ctra. Pozuelo-Majadahonda km. 1800, 28223 Madrid, Spain
Abstract
This paper presents an approach to improving the clarity and intelligibility of speech in digital communications degraded by background noise. We use a deep learning model, a Variational Autoencoder (VAE) with 2D convolutional filters, to suppress background noise in audio signals. The method targets four simulated environmental noise scenarios: storms, wind, traffic, and aircraft. The training dataset was built from public sources: speech from the TED-LIUM 3 dataset (audio recordings of TED talks) was combined with these background noises. The audio signals were converted into 2D power spectrograms, on which the VAE was trained to filter out the noise and reconstruct clean audio. Our results show that the model outperforms existing state-of-the-art solutions in noise suppression. The results were assessed with both objective methods (mathematical metrics) and subjective methods (human listening tests on a set of audio clips). Although differences between noise types were observed, we could not conclude definitively which background noise degrades speech quality the most. Notably, wind noise showed the smallest deviation between the noisy and cleaned audio and was perceived subjectively as the most improved scenario. Future work should refine the phase calculation of the cleaned audio and build a more balanced dataset to reduce differences in audio quality across scenarios. Practical application of the model to real-time streaming audio is also envisaged. This research contributes to the field of audio signal processing by offering a deep learning solution tailored to various noise conditions, improving digital communication quality.
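The data preparation described in the abstract (combining speech with a background noise, then converting the result to a 2D power spectrogram) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names `mix_at_snr` and `power_spectrogram`, the 16 kHz sample rate, the 5 dB signal-to-noise ratio, the 512-sample Hann window, and the 128-sample hop are all assumptions chosen for illustration, and a synthetic tone stands in for a TED-LIUM 3 recording.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the speech-to-noise power ratio is `snr_db`.

    Hypothetical helper: the abstract does not state how the mixing was done.
    """
    noise = np.resize(noise, speech.shape)  # repeat or truncate noise to the speech length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise

def power_spectrogram(signal, n_fft=512, hop=128):
    """Return |STFT|^2 with shape (n_fft // 2 + 1, n_frames); window sizes are assumed."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    return (np.abs(np.fft.rfft(frames, axis=1)) ** 2).T

# Synthetic stand-ins: a 220 Hz tone for speech, white noise for the background.
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
speech = np.sin(2 * np.pi * 220 * t)
noise = np.random.default_rng(0).standard_normal(sample_rate)

noisy = mix_at_snr(speech, noise, snr_db=5.0)
spec = power_spectrogram(noisy)  # 2D array of the kind a convolutional model takes as input
measured_snr = 10 * np.log10(np.mean(speech ** 2) / np.mean((noisy - speech) ** 2))
```

Because the power spectrogram discards phase, reconstructing a waveform from a cleaned spectrogram needs a separate phase estimate, which is the step the abstract lists as future work.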
Cited by
1 article.
1. Deep Learning Approaches for Enhanced Audio Quality Through Noise Reduction;2024 International Conference on Communication, Computer Sciences and Engineering (IC3SE);2024-05-09