Affiliation:
1. Sarnoff Corporation, 201 Washington Road, Princeton, NJ 08536, USA
Abstract
This paper presents an approach to extract semantic layers from aerial surveillance videos for scene understanding and object tracking. The input videos are captured by low-flying aerial platforms and typically contain strong parallax from non-ground-plane structures as well as moving objects. Our approach leverages the geo-registration between video frames and reference images (such as those available from Terraserver and Google satellite imagery) to establish a unique geo-spatial coordinate system for pixels in the video. The geo-registration process enables Euclidean 3D reconstruction with absolute scale, unlike traditional monocular structure from motion, where continuous scale estimation over long periods of time is an issue. Geo-registration also enables correlation of video data with other stored information sources such as GIS (Geo-spatial Information System) databases. In addition to the geo-registration and 3D reconstruction aspects, the key contributions of this paper include: (1) providing a reliable geo-based solution to estimate camera pose for 3D reconstruction, (2) exploiting appearance and 3D shape constraints derived from geo-registered videos to label structures such as buildings, foliage, and roads for scene understanding, and (3) eliminating moving object detection and tracking errors using 3D parallax constraints and semantic labels derived from geo-registered videos. Experimental results on extended-time aerial video data demonstrate the qualitative and quantitative aspects of our work.
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Software
Cited by
3 articles.