Affiliation:
1. Department of Computer Science, Brown University, 115 Waterman Street, Providence, RI 02912-1910, USA
2. Computer Vision Lab, EPFL, CH-1015 Lausanne, Switzerland
Abstract
There is currently a gap between real-world human performance and the decision making of socially interactive robots. This gap is due in part to the difficulty of estimating human cues, such as pose and gesture, from robot sensing. Toward bridging this gap, we present a method for kinematic pose estimation and action recognition from monocular robot vision using dynamical human motion vocabularies. Our notion of a motion vocabulary comprises movement primitives that structure a human's action space for decision making and predict human movement dynamics. Through prediction, such primitives can be used both to generate motor commands for specific actions and to perceive humans performing those actions. In this paper, we focus specifically on perceiving human pose and performed actions using a known vocabulary of primitives. Given image observations over time, each primitive infers pose independently using its expected dynamics in the context of a particle filter. Pose estimates from the set of primitives, each running inference in parallel, are then arbitrated to estimate the action being performed. The efficacy of our approach is demonstrated through interactive-time pose and action recognition over extended motion trials. Results show that our approach requires only a small number of particles for tracking, is robust to unsegmented multi-action movement and to variations in movement speed and camera viewpoint, and can recover from occlusions.
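The core mechanism described in the abstract, a bank of motion primitives, each driving its own particle filter, with an arbiter selecting the best-fitting primitive as the recognized action, can be sketched in a few dozen lines. The following is a minimal illustration, not the authors' implementation: the names ToyPrimitive, PrimitiveParticleFilter, and arbitrate, and the predict/likelihood interfaces, are hypothetical stand-ins, and the image likelihood is reduced to comparison against a noisy pose vector for the sake of a self-contained demo.

```python
import numpy as np

class ToyPrimitive:
    """Illustrative stand-in for a motion primitive: drifts poses toward a
    target configuration. The paper's primitives are learned dynamical
    models of human movement; everything here is a simplification."""
    def __init__(self, target, gain=0.1, noise=0.05):
        self.target, self.gain, self.noise = target, gain, noise

    def predict(self, poses, rng):
        # Flow each pose hypothesis along the primitive's expected dynamics.
        drift = self.gain * (self.target - poses)
        return poses + drift + rng.normal(0.0, self.noise, poses.shape)

    def likelihood(self, poses, observation):
        # Score hypotheses against the observation. A real system would
        # compare a projected body model against monocular image features;
        # here the "observation" is simply a noisy pose vector.
        d2 = ((poses - observation) ** 2).sum(axis=1)
        return np.exp(-d2 / 0.1)

class PrimitiveParticleFilter:
    """One sampling-importance-resampling filter per primitive; the
    primitive's expected dynamics serve as the motion model."""
    def __init__(self, primitive, init_poses, rng):
        self.primitive = primitive
        self.poses = init_poses                      # (n_particles, pose_dim)
        self.weights = np.full(len(init_poses), 1.0 / len(init_poses))
        self.evidence = 0.0
        self.rng = rng

    def step(self, observation):
        # Resample particles in proportion to their current weights,
        # propagate them with this primitive's dynamics, then reweight.
        idx = self.rng.choice(len(self.poses), size=len(self.poses), p=self.weights)
        self.poses = self.primitive.predict(self.poses[idx], self.rng)
        w = self.primitive.likelihood(self.poses, observation)
        self.evidence = w.mean()                     # average fit, for arbitration
        self.weights = w / w.sum()

    def estimate(self):
        return self.weights @ self.poses             # weighted-mean pose

def arbitrate(filters, observation):
    """Step every primitive's filter on the new observation; the filter
    whose particles best explain it names the performed action."""
    for f in filters:
        f.step(observation)
    best = max(filters, key=lambda f: f.evidence)
    return best.estimate(), best.primitive

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    primitives = [ToyPrimitive(np.array([1.0, 0.0])),    # e.g. "reach"
                  ToyPrimitive(np.array([0.0, 1.0]))]    # e.g. "wave"
    filters = [PrimitiveParticleFilter(p, rng.normal(0.0, 1.0, (200, 2)), rng)
               for p in primitives]
    for _ in range(20):                                  # fake observation stream
        pose, action = arbitrate(filters, np.array([0.9, 0.1]))
    print("estimated pose:", pose, "winning target:", action.target)
```

Because each filter runs independently, the primitives naturally parallelize, and the arbitration step doubles as action recognition: the primitive whose predicted dynamics best track the observations is taken to be the action being performed.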
Publisher
World Scientific Publishing Co Pte Ltd
Subject
Artificial Intelligence, Mechanical Engineering
Cited by
14 articles.