Affiliation:
1. Center of Intelligent Acoustics and Immersive Communications, School of Marine Science and Technology, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi’an 710072, China
Abstract
Ambisonic room impulse responses (ARIRs) are recorded to capture the spatial acoustic characteristics of specific rooms and are widely used in virtual and augmented reality. First-order Ambisonics (FOA) microphone arrays are commonly employed for three-dimensional (3D) room acoustics recording because they are readily available, but applications such as binaural rendering and sound field reconstruction need the higher spatial resolution of higher-order Ambisonics (HOA). This paper introduces a novel approach that leverages generative models to upmix ARIRs. The evaluation results validate the model’s effectiveness at upmixing first-order ARIRs to higher-order representations and at overcoming the aliasing frequency limitations. Furthermore, the spectral errors observed in the Binaural Room Transfer Functions (BRTFs) indicate that using upmixed ARIRs for binaural rendering can significantly improve rendering accuracy.
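To make the gap between FOA and HOA concrete, the sketch below computes two standard quantities that the abstract relies on but does not spell out: the number of Ambisonic channels for a given order, (N + 1)^2, and a rule-of-thumb spatial aliasing frequency for a spherical array, obtained from the common condition kr ≤ N. It is not taken from the paper. The function names, the 343 m/s speed of sound, and the 2 cm array radius are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def ambisonic_channels(order: int) -> int:
    """Number of spherical-harmonic channels for Ambisonic order N: (N + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def aliasing_frequency(order: int, radius_m: float, c: float = SPEED_OF_SOUND) -> float:
    """Rule-of-thumb upper frequency for a spherical array of given radius.

    Uses the common approximation k * r <= N, i.e. f = N * c / (2 * pi * r).
    Real arrays deviate from this depending on capsule layout and encoding.
    """
    if radius_m <= 0:
        raise ValueError("radius_m must be positive")
    return order * c / (2.0 * math.pi * radius_m)


if __name__ == "__main__":
    radius = 0.02  # hypothetical array radius in metres
    for n in (1, 2, 3, 4):
        print(n, ambisonic_channels(n), round(aliasing_frequency(n, radius)))
```

Under these assumptions, a first-order recording carries 4 channels while a fourth-order target needs 25, which is the information an upmixing model has to infer.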
Funder
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science