Affiliation:
1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
2. University of Oulu, Finland
Abstract
In this article, to utilize long-term dynamics over an isolated sign sequence, we propose a covariance matrix-based representation to naturally fuse information from multimodal sources. To tackle the drawback induced by the commonly used Riemannian metric, the proximity of covariance matrices is measured on the Grassmann manifold. However, the inherent Grassmann metric cannot be directly applied to the covariance matrix. We solve this problem by evaluating and selecting the most significant singular vectors of covariance matrices of sign sequences. The resulting compact representation is called the Grassmann covariance matrix. Finally, the Grassmann metric is used as a kernel for the support vector machine, which enables learning of the signs in a discriminative manner. To validate the proposed method, we collect three challenging sign language datasets, on which comprehensive evaluations show that the proposed method outperforms the state-of-the-art methods in both accuracy and computational cost.
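The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each sign sequence is given as a (frames x features) array, that the subspace dimension `k` is fixed by hand rather than selected by the paper's evaluation procedure, and that the projection kernel stands in for the Grassmann metric, since the abstract does not say which one is used. The function names and the random input data are hypothetical.

```python
import numpy as np


def covariance_descriptor(frames):
    """Covariance matrix (d x d) of a sequence given as a (T, d) array of per-frame features."""
    return np.cov(frames, rowvar=False)


def leading_subspace(cov, k):
    """Orthonormal basis (d x k) spanned by the k most significant singular vectors of cov."""
    u, _, _ = np.linalg.svd(cov)  # singular values are returned in descending order
    return u[:, :k]


def projection_kernel(y1, y2):
    """Projection kernel between two subspaces: squared Frobenius norm of Y1^T Y2.

    Assumption: this is one commonly used Grassmann kernel, chosen here for illustration.
    """
    return np.linalg.norm(y1.T @ y2, "fro") ** 2


# Hypothetical data: two sequences of different lengths with 6-dimensional frame features.
rng = np.random.default_rng(0)
seq_a = rng.normal(size=(40, 6))
seq_b = rng.normal(size=(55, 6))

k = 3  # assumed subspace dimension
basis_a = leading_subspace(covariance_descriptor(seq_a), k)
basis_b = leading_subspace(covariance_descriptor(seq_b), k)

k_aa = projection_kernel(basis_a, basis_a)  # equals k for any orthonormal basis
k_ab = projection_kernel(basis_a, basis_b)  # lies between 0 and k
```

Evaluating `projection_kernel` over all pairs of training sequences would give a kernel matrix that could be passed to a support vector machine that accepts precomputed kernels.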
Funder
Microsoft Research Asia and the Natural Science Foundation of China
Infotech Oulu
Academy of Finland
Fidipro Program of Tekes
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications, Human-Computer Interaction
Cited by
52 articles.