Evaluating machine learning architectures for sound event detection for signals with variable signal-to-noise-ratios in the Beaufort Sea

Published:2023-10-01 Issue:4 Volume:154 Page:2689-2707
ISSN:0001-4966
Container-title:The Journal of the Acoustical Society of America
language:en

Author:

Ibrahim Malek¹, Sagers Jason D.², Ballard Megan S.², Le Minh², Koutsomitopoulos Vasilis³

Affiliation:

1. Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

2. Applied Research Laboratories, University of Texas at Austin, Austin, Texas 78758, USA

3. Department of Computer Science, University of Texas at Austin, Austin, Texas 78712, USA

Abstract

This paper explores the challenging polyphonic sound event detection problem using machine learning architectures applied to data recorded in the Beaufort Sea during the Canada Basin Acoustic Propagation Experiment. Four candidate architectures were investigated and evaluated on nine classes of signals broadcast from moored sources that were recorded on a vertical line array of hydrophones over the course of the yearlong experiment. These signals represent a high degree of variability with respect to time-frequency characteristics, changes in signal-to-noise ratio (SNR) associated with varying signal levels as well as fluctuating ambient sound levels, and variable distributions, which resulted in class imbalances. Within this context, binary relevance, which decomposes the multi-label learning task into a number of independent binary learning tasks, was examined as an alternative to the conventional multi-label classification (MLC) approach. Binary relevance has several advantages, including flexible, lightweight model configurations that support faster model inference. In the experiments presented, binary relevance outperformed the conventional MLC approach on classes with the most imbalance and lowest SNR. A deeper investigation of model performance as a function of SNR showed that binary relevance significantly improved recall within the low SNR range for all classes studied.
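
The binary relevance decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper evaluates four machine learning architectures on hydrophone recordings, whereas the example below substitutes a plain logistic-regression detector and synthetic data. The three-label setup, the feature construction, and all function names (train_binary, fit_binary_relevance, and so on) are assumptions made purely for illustration. The point it shows is that each label gets its own independently trained binary model, and a multi-label prediction is just the list of those per-label decisions.

```python
import math
import random


def train_binary(X, y, epochs=200, lr=0.5):
    """Fit a logistic-regression detector for ONE label by batch gradient descent.

    Stand-in for a per-class model; the paper's actual architectures differ.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_binary(model, x):
    """Return 1 if the single-label model says the event is present, else 0."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def fit_binary_relevance(X, Y):
    """Binary relevance: train one independent binary model per label column."""
    n_labels = len(Y[0])
    return [train_binary(X, [row[k] for row in Y]) for k in range(n_labels)]


def predict_binary_relevance(models, x):
    """Multi-label prediction is the list of independent per-label decisions."""
    return [predict_binary(m, x) for m in models]


# Synthetic data (assumption): 3 labels that may overlap in one sample, and
# feature k is +1 when label k is active, -1 otherwise, plus Gaussian noise.
random.seed(0)
N_LABELS = 3
Y = [[random.randint(0, 1) for _ in range(N_LABELS)] for _ in range(200)]
X = [[(2 * lab - 1) + random.gauss(0.0, 0.3) for lab in row] for row in Y]

models = fit_binary_relevance(X, Y)
print(predict_binary_relevance(models, [1.0, -1.0, 1.0]))
```

Because the per-label models share nothing, each one could be sized, tuned, or retrained separately, which is the kind of flexibility the abstract attributes to binary relevance. The sketch does not attempt to reproduce the SNR-dependent recall results.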

Funder

Office of Naval Research

Publisher

Acoustical Society of America (ASA)

Subject

Acoustics and Ultrasonics, Arts and Humanities (miscellaneous)

Link

https://pubs.aip.org/asa/jasa/article-pdf/154/4/2689/18185288/2689_1_10.0021974.pdf
