SyncLabeling: A Synchronized Audio Segmentation Interface for Mobile Devices

Published:2023-09-11 Issue:MHCI Volume:7 Page:1-19
ISSN:2573-0142
Container-title:Proceedings of the ACM on Human-Computer Interaction
language:en
Short-container-title:Proc. ACM Hum.-Comput. Interact.

Author:

Tang Yi¹, Chang Chia-Ming², Yang Xi³, Igarashi Takeo²

Affiliation:

1. Jilin University, Jilin, China

2. The University of Tokyo, Tokyo, Japan

3. Jilin University & Engineering Research Center of Knowledge-Driven Human-Machine Intelligence, Changchun, Jilin, China

Abstract

Manual audio segmentation is a time-consuming process, especially when there is more than one sound playing simultaneously that needs to be segmented and annotated (e.g., target and background sounds). In conventional audio annotation interfaces, users need to repeatedly pause and replay the audio to complete an overlap segmentation task, which is very inefficient. In this paper, we propose "SyncLabeling," a synchronized audio segmentation interface for smartphones that allows users to segment and annotate two overlapping sounds in a single audio stream at a time using a game-like labeling interface on mobile devices. We conducted a user study to compare the proposed SyncLabeling interface with a conventional audio annotation interface on four types of audio segmentation tasks. The results showed that the proposed interface is much more efficient than the conventional interface (2.4× faster) under comparable annotation accuracy in most tasks. In addition, more than half of the participants enjoyed using the proposed SyncLabeling interface and showed willingness to use it.

Funder

Jilin University

JST CREST

JST ACT-X

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Human-Computer Interaction, Social Sciences (miscellaneous)

Link

https://dl.acm.org/doi/pdf/10.1145/3604273

Reference: 50 articles.

1. Human Benchmark. [n. d.]. Human Benchmark. https://humanbenchmark.com/tests/reactiontime

2. Nicholas J Bryan and Gautham J Mysore. 2013. Interactive user-feedback for sound source separation. In International Conference on Intelligent User-Interfaces (IUI), Workshop on Interactive Machine Learning. Santa Monica.

3. Overlap-Aware Diarization: Resegmentation Using Neural End-to-End Overlapped Speech Detection

4. Chris Cannam, Christian Landone, Mark B Sandler, and Juan Pablo Bello. 2006. The Sonic Visualiser: A Visualisation Platform for Semantic Descriptors from Musical Signals. In ISMIR. 324–327.

5. Unleashing the killer corpus: experiences in creating the multi-everything AMI Meeting Corpus

Cited by 1 article.

1. PDFChatAnnotator: A Human-LLM Collaborative Multi-Modal Data Annotation Tool for PDF-Format Catalogs; Proceedings of the 29th International Conference on Intelligent User Interfaces; 2024-03-18