Affiliation:
1. Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, P. R. China
2. Amazon, Seattle, WA 98105, USA
3. Department of Computer and Information Sciences, Temple University, PA 19122, USA
Abstract
The indoor environment is a common scene in everyday life, and detecting and tracking multiple targets in this environment is a key component of many applications. However, the task remains challenging due to limited space and intrinsic target appearance variation, e.g., full or partial occlusion, large pose deformation, and scale change. We propose a novel framework for detection and tracking in indoor environments and extend it to robot navigation. One of the key components of our approach is a virtual top view created from an RGB-D camera, called ground plane projection (GPP). The key advantage of GPP is that intrinsic target appearance variation and extrinsic noise are far less likely to appear in GPP than in a regular side-view image. Moreover, free space can be determined in GPP very simply, without any appearance learning, even from a moving camera. Hence GPP is very different from the top-view image obtained from a ceiling-mounted camera. We perform both object detection and tracking in GPP. Two kinds of GPP images are utilized: gray GPP, which stores the maximal height of the 3D points projecting to each pixel, and binary GPP, which is obtained by thresholding the gray GPP. For detection, simple connected-component labeling is used to find footprints of targets in the binary GPP. For tracking, a novel Pixel Level Association (PLA) strategy is proposed to link the same target across consecutive frames in the gray GPP. It utilizes optical flow computed on the gray GPP, which, to the best of our knowledge, has not been done before. The detected and tracked objects in GPP are then "back-projected" to the original side-view (RGB) images, so that objects can be detected and tracked in the side-view images as well. Our system robustly detects and tracks multiple moving targets in real time. The detection process does not rely on any target model, so no training is required. Moreover, tracking does not require any manual initialization, since all entering objects are robustly detected. We also extend the framework to robot navigation by tracking. As our experimental results demonstrate, our approach achieves near-perfect detection and tracking results. The performance gain over state-of-the-art trackers is most significant in the presence of occlusion and background clutter.
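The following is a minimal sketch of the GPP construction and footprint detection steps described in the abstract, assuming the RGB-D point cloud is already expressed in a ground-aligned frame (x, y on the floor, z = height above the ground). The grid parameters, thresholds, and function names are illustrative assumptions, not the authors' implementation.

```python
# Sketch of gray/binary GPP and footprint detection (assumed parameters).
import numpy as np
from scipy import ndimage

def gray_gpp(points_xyz, cell_size=0.05, extent=5.0):
    """Project 3D points onto the ground plane; each GPP pixel keeps the
    maximal height (z) of the points that fall into it."""
    n = int(2 * extent / cell_size)
    gpp = np.zeros((n, n), dtype=np.float32)
    # Discretize x, y into grid indices.
    ix = ((points_xyz[:, 0] + extent) / cell_size).astype(int)
    iy = ((points_xyz[:, 1] + extent) / cell_size).astype(int)
    valid = (ix >= 0) & (ix < n) & (iy >= 0) & (iy < n)
    ix, iy, z = ix[valid], iy[valid], points_xyz[valid, 2]
    # Keep the maximum height per cell (gray GPP).
    np.maximum.at(gpp, (iy, ix), z)
    return gpp

def detect_footprints(gpp, height_thresh=0.3, min_area=20):
    """Threshold the gray GPP into a binary GPP and label connected
    components; sufficiently large components are candidate footprints."""
    binary = gpp > height_thresh
    labels, num = ndimage.label(binary)
    footprints = []
    for k in range(1, num + 1):
        ys, xs = np.nonzero(labels == k)
        if xs.size >= min_area:
            footprints.append((xs.min(), ys.min(), xs.max(), ys.max()))
    return footprints  # bounding boxes in GPP pixel coordinates
```

Frame-to-frame association in the paper's PLA step would then be driven by optical flow computed on consecutive gray GPP images (e.g., a dense flow routine), and the tracked footprints back-projected into the side-view image via the camera calibration; those steps are omitted here.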
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Software
Cited by
16 articles.