Affiliation:
1. School of Software Engineering, Tongji University, Shanghai 201804, China
Abstract
Binaural audio synthesis (BAS) aims to reconstruct binaural audio from mono signals captured in the environment in order to enhance users’ sense of immersion, and it plays an essential role in building augmented reality and virtual reality environments. Existing deep neural network (DNN)-based BAS systems synthesize binaural audio by modeling the entire sound propagation process from the source to the left and right ears, which encompasses early decay, room reverberation, and head- and ear-related filtering. However, this end-to-end approach makes BAS models prone to overfitting when they are trained on a small, homogeneous dataset. Moreover, existing loss functions do not supervise the training process effectively. As a consequence, the synthesized binaural audio remains far from satisfactory in reproducing binaural differences. In this work, we propose a novel DNN-based BAS method, DIFFBAS, which improves the accuracy of synthesized binaural audio from the perspective of the interaural phase difference. Specifically, DIFFBAS is trained using the average of the left- and right-channel signals. To make the model learn the binaural differences, we propose a new loss, the interaural phase difference (IPD) loss, to supervise model training. Extensive experiments demonstrate the effectiveness of both the DIFFBAS model and the IPD loss.
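The abstract does not give the exact form of the IPD loss, so the following is only an illustrative sketch of what a phase-difference objective could look like, not the formulation used in DIFFBAS. It assumes the IPD is taken per time-frequency bin from a short-time Fourier transform, and that the loss is the mean absolute wrapped difference between predicted and reference IPD. The function names (`stft`, `ipd`, `ipd_loss`, `mono_average`) and the parameters `n_fft` and `hop` are hypothetical and chosen for illustration.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """Naive STFT: Hann-windowed frames followed by a real FFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop: i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)  # shape: (n_frames, n_fft // 2 + 1)


def ipd(left, right, n_fft=512, hop=128):
    """Interaural phase difference per time-frequency bin, in radians."""
    spec_l = stft(left, n_fft, hop)
    spec_r = stft(right, n_fft, hop)
    # The angle of L * conj(R) equals phase(L) - phase(R), already wrapped.
    return np.angle(spec_l * np.conj(spec_r))


def ipd_loss(pred_l, pred_r, true_l, true_r, n_fft=512, hop=128):
    """Assumed loss: mean absolute wrapped IPD error (not the paper's definition)."""
    diff = ipd(pred_l, pred_r, n_fft, hop) - ipd(true_l, true_r, n_fft, hop)
    wrapped = np.angle(np.exp(1j * diff))  # wrap the error back into [-pi, pi]
    return float(np.mean(np.abs(wrapped)))


def mono_average(left, right):
    """Average of the two channels, as one reading of the training signal."""
    return 0.5 * (left + right)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.standard_normal(4000)
    right = np.roll(left, 5)  # crude stand-in for an interaural time delay
    print(ipd_loss(left, right, left, right))  # identical channel pairs
    print(ipd_loss(right, left, left, right))  # channels swapped
```

In this sketch a prediction that matches the reference gives zero loss, while swapping the channels flips the sign of the IPD and is penalized. A magnitude-only spectral loss would not register that swap in the same way, which is one plausible motivation for adding a phase-difference term.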
Funder
the Fundamental Research Funds