Abstract
Automated environmental sound recognition has clear engineering benefits; it allows audio to be sorted, curated, and searched. Unlike music and language, environmental sound is loaded with noise and lacks the rhythm and melody of music or the semantic sequence of language, making it difficult to find common features representative enough of various environmental sound signals. To improve the accuracy of environmental sound recognition, this paper proposes a recognition method based on multi-feature parameters and time–frequency attention module. It begins with a pretreatment that relies on multi-feature parameters to extract the sound, which supplements the phase information lost by the Log-Mel spectrogram in the current mainstream methods, and enhances the expressive ability of input features. A time–frequency attention module with multiple convolutions is designed to extract the attention weight of the input feature spectrogram and reduce the interference coming from the background noise and irrelevant frequency bands in the audio. Comparative experiments were conducted on three general datasets: environmental sound classification datasets (ESC-10, ESC-50) and an UrbanSound8K dataset. Experiments demonstrated that the proposed method performs better.
Funder
the National Science Foundation of China
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Reference27 articles.
1. Sound identification for fire-fighting mobile robots;Baum;Proceedings of the IEEE International Conference on Robotic Computing,2018
2. Robust Environmental Sound Recognition for Home Automation
3. Audio analysis for surveillance applications;Radhakrishnan;Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics,2005
4. Spectral vs. spectro-temporal features for acoustic event detection;Cotton;Proceedings of the 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),2011
5. A Lung Sound Category Recognition Method Based on Wavelet Decomposition and BP Neural Network
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献