Affiliation:
1. Beijing University of Technology
Abstract
This paper proposes a fast method for extracting the lip contour and visual speech features from color images or video sequences of frontal talking faces. The method combines lip color information in the Lab color space with prior knowledge of the geometric characteristics of the lip region. Face detection is performed with the AdaBoost algorithm, and the mouth area is then segmented based on the attributes of the face shape. An adaptive threshold is derived from the relative positions of the trough and crest of the histogram. The a-component of the Lab color space is used to extract the outer lip contour, and the L-component is used to extract the inner lip contour. The visual features are obtained from the contour image by searching the contour points twice. Experimental results show that the visual feature values obtained with this approach are close to those obtained with the active appearance model (AAM) algorithm, at lower computational complexity.
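The abstract does not state the exact rule that turns the histogram trough and crest into a threshold, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the a-component of the mouth region is already available as an array scaled to 0–255, that the dominant crest corresponds to skin and a second crest to the lips, and it places the threshold at the trough between them. The function name `trough_threshold`, the moving-average smoothing, and the second-crest heuristic are all assumptions introduced for illustration.

```python
import numpy as np


def trough_threshold(a_channel, bins=256, smooth=5):
    """Pick a threshold at the histogram trough between the two main crests.

    a_channel: array of a-component values, assumed scaled to 0..255.
    Returns the left edge of the trough bin.
    """
    values = np.asarray(a_channel, dtype=np.float64).ravel()
    hist, edges = np.histogram(values, bins=bins, range=(0, 256))

    # Moving-average smoothing so that noise does not create spurious crests.
    kernel = np.ones(smooth) / smooth
    hist = np.convolve(hist, kernel, mode="same")

    # First crest: the global maximum (assumed here to be skin).
    first = int(np.argmax(hist))

    # Second crest (assumed to be lips): weight each bin by its squared
    # distance from the first crest so the skin peak's shoulders are ignored.
    idx = np.arange(bins)
    second = int(np.argmax(hist * (idx - first) ** 2))

    # Trough: the lowest smoothed bin between the two crests.
    lo, hi = sorted((first, second))
    trough = lo + int(np.argmin(hist[lo:hi + 1]))
    return edges[trough]
```

Under these assumptions, a candidate outer-lip mask would be `a_channel > trough_threshold(a_channel)`, since lip pixels are redder than skin and therefore have larger a values. The same routine could be applied to the L-component for the inner lip, although the comparison direction would need to be chosen for that channel.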
Publisher
Trans Tech Publications, Ltd.