An overview of machine learning and other data-based methods for spatial audio capture, processing, and reproduction

Published: 2022-05-16 Issue: 1 Volume: 2022
ISSN:1687-4722
Container-title:EURASIP Journal on Audio, Speech, and Music Processing
language:en
Short-container-title:J AUDIO SPEECH MUSIC PROC.

Author:

Cobos Maximo, Ahrens Jens, Kowalczyk Konrad, Politis Archontis

Abstract

The domain of spatial audio comprises methods for capturing, processing, and reproducing audio content that contains spatial information. Data-based methods are those that operate directly on the spatial information carried by audio signals. This is in contrast to model-based methods, which impose spatial information from, for example, metadata like the intended position of a source onto signals that are otherwise free of spatial information. Signal processing has traditionally been at the core of spatial audio systems, and it continues to play a very important role. The irruption of deep learning in many closely related fields has put the focus on the potential of learning-based approaches for the development of data-based spatial audio applications. This article reviews the most important application domains of data-based spatial audio, including well-established methods that employ conventional signal processing, while paying special attention to the most recent achievements that make use of machine learning. Our review is organized based on the topology of the spatial audio pipeline, which consists of capture, processing/manipulation, and reproduction. The literature on the three stages of the pipeline is discussed, as well as the literature on the spatial audio representations that are used to transmit the content between them, highlighting the key references and elaborating on the underlying concepts. We reflect on the literature by juxtaposing the prerequisites that made machine learning successful in domains other than spatial audio with those found in the domain of spatial audio today. Based on this, we identify routes that may facilitate future advancement.

Funder

National Science Centre

ERDF

Ministerio de Ciencia, Innovación y Universidades

Generalitat Valenciana

Chalmers University of Technology

Publisher

Springer Science and Business Media LLC

Subject

Electrical and Electronic Engineering,Acoustics and Ultrasonics

Link

https://link.springer.com/content/pdf/10.1186/s13636-022-00242-x.pdf

Reference: 214 articles.

1. J. Y. Hong, J. He, B. Lam, R. Gupta, W.-S. Gan, Spatial audio for soundscape design: recording and reproduction. Appl. Sci. 7(6) (2017). https://doi.org/10.3390/app7060627.

2. W. Zhang, P. N. Samarasinghe, H. Chen, T. D. Abhayapala, Surround by sound: a review of spatial audio recording and reproduction. Appl. Sci. 7(5) (2017). https://doi.org/10.3390/app7050532.

3. F. Rumsey, Spatial quality evaluation for reproduced sound: terminology, meaning, and a scene-based paradigm. J. Audio Eng. Soc. 50(9), 651–666 (2002).

4. J. Francombe, T. Brookes, R. Mason, Evaluation of spatial audio reproduction methods (part 1): elicitation of perceptual differences. J. Audio Eng. Soc. 65(3), 198–211 (2017).

5. M. Cobos, J. J. Lopez, J. M. Navarro, G. Ramos, Subjective quality assessment of multichannel audio accompanied with video in representative broadcasting genres. Multimed. Syst. 21(4), 363–379 (2015).

Cited by 20 articles.

1. 3D printing of biodegradable polymers and their composites – Current state-of-the-art, properties, applications, and machine learning for potential future applications;Progress in Materials Science;2024-12

2. Physics-constrained adaptive kernel interpolation for region-to-region acoustic transfer function: a Bayesian approach;EURASIP Journal on Audio, Speech, and Music Processing;2024-09-10

3. Digital human and embodied intelligence for sports science: advancements, opportunities and prospects;The Visual Computer;2024-06-21

4. MIRACLE—a microphone array impulse response dataset for acoustic learning;EURASIP Journal on Audio, Speech, and Music Processing;2024-06-18

5. Image Generation Using AI with Effective Audio Playback System;2024 5th International Conference for Emerging Technology (INCET);2024-05-24