Automatic Spatial Audio Scene Classification in Binaural Recordings of Music

Published: 2019-04-26 Volume: 9 Issue: 9 Page: 1724
ISSN: 2076-3417
Journal: Applied Sciences
Language: en

Author:

Zieliński Sławomir K., Lee Hyunkook

Abstract

The aim of the study was to develop a method for the automatic classification of three spatial audio scenes that differ in the horizontal distribution of foreground and background audio content around a listener in binaurally rendered recordings of music. For the purpose of the study, audio recordings were synthesized using thirteen sets of binaural room impulse responses (BRIRs), representing the room acoustics of both semi-anechoic and reverberant venues. Head movements were not considered in the study. The proposed method was assumption-free with regard to the number and characteristics of the audio sources. A least absolute shrinkage and selection operator (LASSO) was employed as a classifier. According to the results, it is possible to automatically identify the spatial scenes using a combination of binaural and spectro-temporal features. The method exhibits satisfactory classification accuracy when it is trained and then tested on different stimuli synthesized using the same BRIRs (accuracy ranging from 74% to 98%), even in highly reverberant conditions. However, the generalizability of the method needs to be further improved. This study demonstrates that, in addition to the binaural cues, the Mel-frequency cepstral coefficients constitute an important carrier of spatial information, imperative for the classification of spatial audio scenes.
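
The abstract names the least absolute shrinkage and selection operator as the classifier but this record gives no implementation details. The following is only a minimal sketch of the general idea, assuming a plain L1-penalized least-squares fit by coordinate descent and a two-class decision taken from the sign of the score. The function names, the toy feature values, and the two-class simplification are all hypothetical; the study itself distinguishes three scenes and may use a different lasso formulation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; weak features end up at exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature cannot carry any weight
            # Correlation of feature j with the residual, with its own
            # current contribution added back in.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def predict(w, features):
    """Return +1 or -1 from the sign of the weighted feature sum."""
    score = sum(wj * xj for wj, xj in zip(w, features))
    return 1 if score >= 0 else -1


# Toy data invented for illustration: column 0 separates the two classes,
# column 1 is uninformative. These are NOT features from the study.
X = [[1.0, 0.1], [-1.0, 0.1], [2.0, -0.1], [-2.0, -0.1]]
y = [1.0, -1.0, 1.0, -1.0]
weights = lasso_fit(X, y, alpha=0.25)
print(weights)  # [0.5, 0.0]
```

The uninformative column receives a weight of exactly zero, which illustrates why an L1 penalty doubles as a feature selector when many candidate features (for example binaural cues and cepstral coefficients) are supplied at once.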

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/9/9/1724/pdf

References (60 articles)

1. YouTube Live-Streams in Virtual Reality and Adds 3D Sound, BBC News http://www.bbc.com/news/technology-36073009

2. Binaural Audio at the BBC Proms https://www.bbc.co.uk/rd/blog/2016-09-binaural-proms

3. Omnitone: Spatial Audio on the Web, Google, USA https://opensource.googleblog.com/2016/07/omnitone-spatial-audio-on-web.html

4. The Technology of Binaural Listening;Blauert,2013

5. Spatial quality evaluation for reproduced sound: Terminology, meaning, and a scene-based paradigm;Rumsey;J. Audio Eng. Soc.,2002

Cited by 9 articles.

1. Acoustic Scene Classification using Deep Fisher network;Digital Signal Processing;2023-07

2. Spatial Audio Coding and Machine Learning;Encyclopedia of Data Science and Machine Learning;2022-10-14

3. A Preliminary Investigation on Frequency Dependant Cues for Human Emotions;Acoustics;2022-05-22

4. Spatial Audio Scene Characterization (SASC): Automatic Localization of Front-, Back-, Up-, and Down-Positioned Music Ensembles in Binaural Recordings;Applied Sciences;2022-02-01

5. Automatic discrimination between front and back ensemble locations in HRTF-convolved binaural recordings of music;EURASIP Journal on Audio, Speech, and Music Processing;2022-01-15