Author:
Takeda Ryu, Komatani Kazunori
Abstract
[Figure: Sound source localization and problem]
We focus on the problem of localizing soft/weak voices recorded by small humanoid robots, such as NAO. Sound source localization (SSL) for such robots requires fast processing and noise robustness owing to the restricted resources and the internal noise close to the microphones. Multiple signal classification using generalized eigenvalue decomposition (GEVD-MUSIC) is a promising method for SSL. It achieves noise robustness by whitening robot internal noise using prior noise information. However, whitening increases the computational cost and creates a direction-dependent bias in the localization score, which degrades the localization accuracy. We have thus developed a new implementation of GEVD-MUSIC based on steering vector transformation (TSV-MUSIC). Applying a transformation equivalent to whitening to the steering vectors in advance reduces the real-time computational cost of TSV-MUSIC. Moreover, normalizing the transformed vectors cancels the direction-dependent bias and improves the localization accuracy. Experiments using simulated data showed that TSV-MUSIC had the highest accuracy of the methods tested. An experiment using real recorded data showed that TSV-MUSIC outperformed GEVD-MUSIC and other MUSIC methods in localization accuracy by about 4 points under low signal-to-noise-ratio conditions.
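The idea of moving the whitening transform onto the steering vectors can be illustrated with a minimal NumPy sketch. It is one plausible reading of the abstract, not the authors' implementation. The uniform linear array, the noise model, the grid of candidate angles, and all function names (`steering_vectors`, `transform_steering_vectors`, `whitened_music_spectrum`) are assumptions made for illustration. The sketch factors the noise correlation matrix as K = L L^H, applies L^{-1} to every steering vector offline, normalizes the result, and then evaluates a MUSIC-style score in the whitened domain.

```python
import numpy as np

def steering_vectors(angles_deg, n_mics=8, spacing_wavelengths=0.5):
    """Far-field steering vectors of an assumed uniform linear array.

    Returns an array of shape (n_mics, n_angles).
    """
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    mic_index = np.arange(n_mics)[:, None]
    phase = -2j * np.pi * spacing_wavelengths * mic_index * np.sin(angles)[None, :]
    return np.exp(phase)

def transform_steering_vectors(A, K):
    """Offline step: whiten the steering vectors with L^{-1}, where K = L L^H,
    then scale each transformed vector to unit norm."""
    L = np.linalg.cholesky(K)            # lower-triangular factor of the noise matrix
    B = np.linalg.solve(L, A)            # L^{-1} a(theta) for every candidate direction
    B /= np.linalg.norm(B, axis=0, keepdims=True)
    return B, L

def whitened_music_spectrum(R, L, B, n_sources=1):
    """Runtime step: MUSIC score using the noise subspace of the whitened matrix."""
    X = np.linalg.solve(L, R)                       # L^{-1} R
    Rw = np.linalg.solve(L, X.conj().T).conj().T    # L^{-1} R L^{-H}
    Rw = 0.5 * (Rw + Rw.conj().T)                   # enforce Hermitian symmetry
    _, U = np.linalg.eigh(Rw)                       # eigenvalues in ascending order
    Un = U[:, : R.shape[0] - n_sources]             # noise-subspace eigenvectors
    denom = np.sum(np.abs(Un.conj().T @ B) ** 2, axis=0)
    return 1.0 / np.maximum(denom, 1e-12)           # guard against division by zero

# Assumed scenario: strong directional noise at -40 degrees, weak voice at 20 degrees.
angles = np.arange(-90, 91, 5)
A = steering_vectors(angles)
noise_dir = steering_vectors([-40])[:, 0]
K = 10.0 * np.outer(noise_dir, noise_dir.conj()) + 0.1 * np.eye(8)
source_dir = steering_vectors([20])[:, 0]
R = 0.05 * np.outer(source_dir, source_dir.conj()) + K   # exact (not sampled) correlation

B, L = transform_steering_vectors(A, K)
spectrum = whitened_music_spectrum(R, L, B)
estimated = angles[np.argmax(spectrum)]
print("estimated direction (degrees):", estimated)
```

Because the transformed vectors depend only on the array geometry and the prior noise matrix, `transform_steering_vectors` can run once before localization starts. The unit-norm scaling is what keeps the score from favoring directions where the whitening transform happens to stretch the steering vector.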
Publisher
Fuji Technology Press Ltd.
Subject
Electrical and Electronic Engineering, General Computer Science