Affiliation:
1. School of Architecture, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
Abstract
Deep learning is one established tool for carrying out classification tasks on complex, multi-dimensional data. Since audio recordings contain a frequency and temporal component, long-term monitoring of bioacoustics recordings is made more feasible with these computational frameworks. Unfortunately, these neural networks are rarely designed for the task of open set classification in which examples belonging to the training classes must not only be correctly classified but also crucially separated from any spurious or unknown classes. To combat this reliance on closed set classifiers which are singularly inappropriate for monitoring applications in which many non-relevant sounds are likely to be encountered, the performance of several open set classification frameworks is compared on environmental audio datasets recorded and published within this work, containing both biological and anthropogenic sounds. The inference-based open set classification techniques include prediction score thresholding, distance-based thresholding, and OpenMax. Each open set classification technique is evaluated under multi-, single-, and cross-corpus scenarios for two different types of unknown data, configured to highlight common challenges inherent to real-world classification tasks. The performance of each method is highly dependent upon the degree of similarity between the training, testing, and unknown domain.
Funder
National Science Foundation
RPI HASS Fellowship
Publisher
Acoustical Society of America (ASA)
Subject
Acoustics and Ultrasonics,Arts and Humanities (miscellaneous)
Cited by
9 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献