A Human-in-the-Loop System for Sound Event Detection and Annotation

Published: 2018-07-14 Issue: 2 Volume: 8 Page: 1-23
ISSN: 2160-6455
Container-title: ACM Transactions on Interactive Intelligent Systems
Language: en
Short-container-title: ACM Trans. Interact. Intell. Syst.

Author:

Kim Bongjun¹, Pardo Bryan¹

Affiliation:

1. Northwestern University, USA

Abstract

Labeling of audio events is essential for many tasks. However, finding sound events and labeling them within a long audio file is tedious and time-consuming. In cases where there is very little labeled data (e.g., a single labeled example), it is often not feasible to train an automatic labeler because many techniques (e.g., deep learning) require a large number of human-labeled training examples. Also, fully automated labeling may not show sufficient agreement with human labeling for many uses. To solve this issue, we present a human-in-the-loop sound labeling system that helps a user quickly label target sound events in a long audio recording. It lets a user reduce the time required to label a long audio file (e.g., 20 hours) containing target sounds that are sparsely distributed throughout the recording (10% or less of the audio contains the target) when there are too few labeled examples (e.g., one) to train a state-of-the-art machine audio labeling system. To evaluate the effectiveness of our tool, we performed a human-subject study. The results show that it helped participants label target sound events twice as fast as labeling them manually. In addition to measuring the overall performance of the proposed system, we also measure interaction overhead and machine accuracy, which are two key factors that determine the overall performance. The analysis shows that an ideal interface with no interaction overhead at all could speed labeling by as much as a factor of four.

Funder

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Human-Computer Interaction

Link

https://dl.acm.org/doi/pdf/10.1145/3214366

References: 39 articles.

1. Examining multiple potential models in end-user interactive concept learning

2. CueT

3. Audio brush

Cited by 27 articles.

1. Designing controllers for hand tremor suppression using model simplification;Biomedical Signal Processing and Control;2024-10

2. Unpacking Human-AI interactions: From Interaction Primitives to a Design Space;ACM Transactions on Interactive Intelligent Systems;2024-08-02

3. Deep Active Audio Feature Learning in Resource-Constrained Environments;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024

4. Automatic Multiple Sounds Detection with Recurrent Neural Networks (LSTM);Lecture Notes in Networks and Systems;2024

5. Selective Annotation of Few Data for Beat Tracking of Latin American Music Using Rhythmic Features;Transactions of the International Society for Music Information Retrieval;2024