
Improved capsule routing for weakly labeled sound event detection

Published: 2022-03-07 Issue: 1 Volume: 2022 Page:
ISSN: 1687-4722
Container-title: EURASIP Journal on Audio, Speech, and Music Processing
Language: en
Short-container-title: J AUDIO SPEECH MUSIC PROC.

Author:

Li Haitao, Yang Shuguo, Wang Wenwu

Abstract

Polyphonic sound event detection aims to detect the types of sound events that occur in a given audio clip, together with their onset and offset times, where multiple sound events may occur simultaneously. Deep learning–based methods such as convolutional neural networks (CNNs) have achieved state-of-the-art results in polyphonic sound event detection. However, two open challenges remain: overlap between events and a tendency to overfit. To address these two problems, we propose a capsule network-based method for polyphonic sound event detection. With so-called dynamic routing, capsule networks have the advantage of handling overlapping objects and the generalization ability to reduce overfitting. However, dynamic routing also greatly slows down the training process. To speed up training, we propose a weakly labeled polyphonic sound event detection model based on improved capsule routing. The proposed method is evaluated on task 4 of the DCASE 2017 challenge and compared with several baselines, demonstrating competitive results in terms of F-score and computational efficiency.
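For readers unfamiliar with the dynamic routing mentioned in the abstract, the following is a minimal sketch of the standard routing-by-agreement procedure from Sabour, Frosst, and Hinton (2017), written in plain Python. It is not the improved routing proposed in this paper, whose details are not given in this record. The function names (`squash`, `softmax`, `dynamic_routing`), the toy tensor sizes, and the choice of three iterations are illustrative assumptions. The inner loop over routing iterations hints at why the procedure can slow training.

```python
import math
import random

def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iters=3):
    """Standard routing-by-agreement (assumed baseline, not the paper's variant).

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j. Returns the output capsule vectors v and the
    coupling coefficients c used to compute them.
    """
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits
    for _ in range(num_iters):
        # Each input capsule distributes its vote across output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            # Weighted sum of predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output (dot product).
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c

# Toy example: 6 input capsules, 3 output capsules, 4-dimensional outputs.
random.seed(0)
u_hat = [[[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
         for _ in range(6)]
v, c = dynamic_routing(u_hat)
```

In this sketch the length of each vector in `v` can be read as the presence probability of the corresponding class, which is how capsule outputs are commonly interpreted.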

Publisher

Springer Science and Business Media LLC

Subject

Electrical and Electronic Engineering, Acoustics and Ultrasonics

Link

https://link.springer.com/content/pdf/10.1186/s13636-022-00239-6.pdf

References: 33 articles.

1. N. Cho, E.K. Kim, Enhanced voice activity detection using acoustic event detection and classification. IEEE Trans. Consum. Electron. 57, 196 (2011).

2. N.C. Phuong, T. Do Dat, Sound classification for event detection: application into medical telemonitoring. 2013 Int. Conf. Comput. Manag. Telecommun. (ComManTel 2013), 330 (2013).

3. T.K. Chan, C.S. Chin, Health stages diagnostics of underwater thruster using sound features with imbalanced dataset. Neural Comput. Appl. 31, 5767 (2019).

4. Z. Zhao, S. Zhang, Z. Xu, K. Bellisario, N. Dai, H. Omrani, B.C. Pijanowski, Automated bird acoustic event detection and robust species classification. Ecol. Inform. 39, 99 (2017).

5. E. Cakir, G. Parascandolo, T. Heittola, H. Huttunen, T. Virtanen, Convolutional recurrent neural networks for polyphonic sound event detection. IEEE/ACM Trans. Audio, Speech and Lang. Proc. 25, 1291 (2017).

Cited by 1 article.

1. Greedy regression and differential convex-based deep learning for audio event classification; Journal of Intelligent & Fuzzy Systems; 2023-12-02