A framework for generating large-scale microphone array data for machine learning

Published:2023-09-25
ISSN:1380-7501
Container-title:Multimedia Tools and Applications
language:en
Short-container-title:Multimed Tools Appl

Author:

Kujawski Adam, Pelling Art J. R., Jekosch Simon, Sarradj Ennes

Abstract

The use of machine learning for localization of sound sources from microphone array data has increased rapidly in recent years. Newly developed methods are of great value for hearing aids, speech technologies, smart home systems or engineering acoustics. The existence of openly available data is crucial for the comparability and development of new data-driven methods. However, the literature review reveals a lack of openly available datasets, especially for large microphone arrays. This contribution introduces a framework for generation of acoustic data for machine learning. It implements tools for the reproducible random sampling of virtual measurement scenarios. The framework allows computations on multiple machines, which significantly speeds up the process of data generation. Using the framework, an example of a development dataset for sound source characterization with a 64-channel array is given. A containerized environment running the simulation source code is openly available. The presented approach enables the user to calculate large datasets, to store only the features necessary for training, and to share the source code which is needed to reproduce datasets instead of sharing the data itself. This avoids the problem of distributing large datasets and enables reproducible research.
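The idea of reproducible random sampling across multiple machines can be illustrated with a minimal sketch. It is not the framework's actual API: the function names, the scenario fields, the value ranges, and the seeding scheme (one generator per sample, seeded from a base seed and the sample index) are all assumptions made for illustration.

```python
import random


def sample_scenario(base_seed: int, index: int, max_sources: int = 3) -> dict:
    """Draw one hypothetical virtual measurement scenario.

    The result depends only on (base_seed, index), so any machine can
    regenerate any sample without storing or exchanging the data.
    """
    # One generator per sample: the draw does not depend on the order in
    # which samples are computed or on which worker computes them.
    rng = random.Random(f"{base_seed}:{index}")
    n_sources = rng.randint(1, max_sources)
    sources = [
        {
            "x": rng.uniform(-0.5, 0.5),        # assumed source position range
            "y": rng.uniform(-0.5, 0.5),
            "strength": rng.uniform(0.1, 1.0),  # assumed source strength range
        }
        for _ in range(n_sources)
    ]
    return {"index": index, "sources": sources}


def generate(base_seed: int, indices) -> list:
    """Generate the scenarios for a set of sample indices."""
    return [sample_scenario(base_seed, i) for i in indices]


# Two workers that each process half of the index range produce the same
# scenarios as a single run over the full range.
worker_a = generate(42, range(0, 5))
worker_b = generate(42, range(5, 10))
single_run = generate(42, range(10))
print(worker_a + worker_b == single_run)
```

Under this scheme, sharing the source code, the base seed, and the index range is enough to reproduce a dataset, which mirrors the abstract's point about sharing code instead of data.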

Funder

Deutsche Forschungsgemeinschaft

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Hardware and Architecture, Media Technology, Software

Link

https://link.springer.com/content/pdf/10.1007/s11042-023-16947-w.pdf

References (61 articles)

1. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mané D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Viégas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, Zheng X (2015) TensorFlow: Large-scale machine learning on heterogeneous systems. Software available from tensorflow.org (Last viewed September 5, 2022). https://www.tensorflow.org/

2. Adavanne S, Politis A, Virtanen T (2019) A multi-room reverberant dataset for sound event localization and detection. In: Proceedings of the detection and classification of acoustic scenes and events workshop (DCASE Workshop). New York, NY

3. Bianco MJ, Gerstoft P, Traer J, Ozanich E, Roch MA, Gannot S, Deledalle CA (2019) Machine learning in acoustics: Theory and applications. J. Acoust. Soc. Am. 146(5):3590–3628. https://doi.org/10.1121/1.5133944

4. Brousmiche M, Rouat J (2020) SECL-UMons database for sound event classification and localization. In: Proceedings of the ICASSP, pp 756–760. IEEE, May 4-8, Barcelona, Spain. https://doi.org/10.1109/ICASSP40776.2020.9053298

5. Cardenas Cabada E, Leclere Q, Antoni J, Hamzaoui N (2017) Fault detection in rotating machines with beamforming: Spatial visualization of diagnosis features. Mech. Syst. Signal Process. 97:33–43. https://doi.org/10.1016/j.ymssp.2017.04.018

Cited by 2 articles.

1. Three-dimensional grid-free sound source localization method based on deep learning. Applied Acoustics, 2025-01

2. MIRACLE: a microphone array impulse response dataset for acoustic learning. EURASIP Journal on Audio, Speech, and Music Processing, 2024-06-18