Author:
Dorfan Yuval,Schwartz Ofer,Gannot Sharon
Abstract
AbstractAd hoc acoustic networks comprising multiple nodes, each of which consists of several microphones, are addressed. From the ad hoc nature of the node constellation, microphone positions are unknown. Hence, typical tasks, such as localization, tracking, and beamforming, cannot be directly applied. To tackle this challenging joint multiple speaker localization and array calibration task, we propose a novel variant of the expectation-maximization (EM) algorithm. The coordinates of multiple arrays relative to an anchor array are blindly estimated using naturally uttered speech signals of multiple concurrent speakers. The speakers’ locations, relative to the anchor array, are also estimated. The inter-distances of the microphones in each array, as well their orientations, are assumed known, which is a reasonable assumption for many modern mobile devices (in outdoor and in a several indoor scenarios). The well-known initialization problem of the batch EM algorithm is circumvented by an incremental procedure, also derived here. The proposed algorithm is tested by an extensive simulation study.
Publisher
Springer Science and Business Media LLC
Subject
Electrical and Electronic Engineering,Acoustics and Ultrasonics
Reference78 articles.
1. G. Lathoud, J. -M. Odobez, D. Gatica-Perez, in International Workshop on Machine Learning for Multimodal Interaction. AV16.3: an audio-visual corpus for speaker localization and tracking (Springer, 2004), pp. 182–195. https://doi.org/10.1007/978-3-540-30568-2_16.
2. T. Yamada, S. Nakamura, K. Shikano, in Fourth IEEE International Conference on Spoken Language, vol. 3. Robust speech recognition with speaker localization by a microphone array, (1996), pp. 1317–1320. https://doi.org/10.1109/icslp.1996.607855.
3. S. Doclo, M. Moonen, Robust adaptive time delay estimation for speaker localization in noisy and reverberant acoustic environments. EURASIP J. Appl. Sig. Process.2003:, 1110–1124 (2003).
4. O. Schwartz, S. Gannot, Speaker tracking using recursive EM algorithms. IEEE/ACM Trans. Audio Speech Lang. Process.22(2), 392–402 (2014).
5. N. Madhu, R. Martin, in Proceedings of the International Workshop on Acoustic Echo Cancellation and Noise Control (IWAENC). A scalable framework for multiple speaker localization and tracking, (2008).
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献