Teacher-Student Framework for Polyphonic Semi-supervised Sound Event Detection: Survey and Empirical Analysis

Published: 2024-04-23
ISSN: 2157-6904
Container-title: ACM Transactions on Intelligent Systems and Technology
Language: en
Short-container-title: ACM Trans. Intell. Syst. Technol.

Author:

Diffallah Zhor¹, Ykhlef Hadjer¹, Bouarfa Hafida¹

Affiliation:

1. University of Blida 1, Algeria

Abstract

Polyphonic sound event detection refers to the task of automatically identifying sound events occurring simultaneously in an auditory scene. Due to the inherent complexity and variability of real-world auditory scenes, building robust detectors for polyphonic sound event detection poses a significant challenge. The task becomes even more challenging when there is not enough annotated data to develop sound event detection systems under a supervised learning regime. In this paper, we explore the recent developments in polyphonic sound event detection, with a particular emphasis on the application of Teacher-Student techniques within the semi-supervised learning paradigm. Unlike previous works, we consolidate and organize the fragmented literature on Teacher-Student techniques for polyphonic sound event detection. By examining the latest research, categorizing Teacher-Student approaches, and conducting an empirical study to assess the performance of each approach, this survey offers insights and practical guidance for researchers and practitioners in the field. Our findings highlight the potential benefits of utilizing multiple learners, ensuring consistent predictions, and making thoughtful choices regarding perturbation strategies.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3660641
