Author:
Nakadai Kazuhiro, Tezuka Taiki, Yoshida Takami
Abstract
[Figure: Ego-noise suppression achieves speech recognition even during motion]

This paper addresses ego-motion noise suppression for a robot. Many ego-motion noise suppression methods use motion information, such as the position, velocity, and acceleration of each joint, to infer ego-motion noise. However, such inferences are unreliable because motion information and ego-motion noise are not always correlated. We propose a new framework for ego-motion noise suppression based on single-channel processing that uses only the acoustic signal captured with a microphone. In the proposed framework, ego-motion noise features and their number are first estimated from an ego-motion noise recording using Infinite Non-negative Matrix Factorization (INMF), a non-parametric Bayesian model that requires no explicit motion information. The proposed Semi-Blind INMF (SB-INMF) is then applied to an input signal containing both the target and ego-motion noise signals. The noise features obtained with INMF are fed into SB-INMF and held fixed, while the features of the target signal are newly estimated; the target signal is finally extracted using these newly estimated features. The proposed framework was applied to ego-motion noise suppression on two types of humanoid robots. Experimental results showed that ego-motion noise was suppressed effectively and efficiently, in terms of both signal-to-noise ratio and automatic speech recognition performance, compared with a conventional template-based ego-motion noise suppression method that relies on motion information. The proposed method thus works on robots that provide no motion information interface.

This work is an extension of our publication: Taiki Tezuka, Takami Yoshida, and Kazuhiro Nakadai, “Ego-motion noise suppression for robots based on Semi-Blind Infinite Non-negative Matrix Factorization,” ICRA 2014, pp. 6293-6298, 2014.
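To make the semi-blind factorization idea concrete, here is a minimal Python/NumPy sketch. It is not the authors' implementation: plain KL-divergence NMF with multiplicative updates stands in for INMF (which would additionally infer the number of noise bases), noise bases learned from a noise-only recording are held fixed while target bases are learned on the mixture, and the target is reconstructed with a Wiener-style mask. All function names, variable names, and the basis counts are illustrative assumptions.

```python
import numpy as np

EPS = 1e-12

def kl_nmf(V, n_free, n_iter=200, W_fixed=None, seed=0):
    """KL-divergence NMF with multiplicative updates.

    V       : (freq, time) non-negative magnitude spectrogram
    n_free  : number of basis vectors to learn
    W_fixed : optional (freq, k) basis matrix kept constant during the
              updates -- the "semi-blind" part of this sketch
    Returns the full basis matrix W = [W_fixed | W_free] and activations H.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    k_fix = 0 if W_fixed is None else W_fixed.shape[1]
    W_free = rng.random((F, n_free)) + EPS
    H = rng.random((k_fix + n_free, T)) + EPS
    for _ in range(n_iter):
        W = W_free if W_fixed is None else np.hstack([W_fixed, W_free])
        R = V / (W @ H + EPS)                       # element-wise V / (WH)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + EPS)
        R = V / (W @ H + EPS)
        # Update only the free (target) bases; fixed noise bases stay as-is.
        W_free *= (R @ H[k_fix:].T) / (H[k_fix:].sum(axis=1)[None, :] + EPS)
    W = W_free if W_fixed is None else np.hstack([W_fixed, W_free])
    return W, H

# Toy data in place of real STFT magnitudes (shape: freq bins x frames).
rng = np.random.default_rng(1)
V_noise = np.abs(rng.standard_normal((257, 200)))   # noise-only segment
V_mix   = np.abs(rng.standard_normal((257, 400)))   # target + noise mixture

# 1) Learn ego-noise bases from the noise-only recording (INMF stand-in;
#    the basis count 20 is a guess, INMF would infer it automatically).
W_noise, _ = kl_nmf(V_noise, n_free=20)

# 2) Semi-blind step: keep W_noise fixed, learn target bases on the mixture.
W_all, H = kl_nmf(V_mix, n_free=20, W_fixed=W_noise)

# 3) Wiener-style reconstruction of the target magnitude spectrogram.
k = W_noise.shape[1]
target = W_all[:, k:] @ H[k:]
noise  = W_all[:, :k] @ H[:k]
V_target = V_mix * target / (target + noise + EPS)
```

An inverse STFT of V_target with the mixture phase would give the time-domain estimate; a real implementation would replace the toy arrays with actual spectrograms and the fixed basis count with the number inferred by INMF.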
Publisher
Fuji Technology Press Ltd.
Subject
Electrical and Electronic Engineering, General Computer Science