Abstract
Visual object tracking in unmanned aerial vehicle (UAV) videos plays an important role in a variety of fields, such as traffic data collection, traffic monitoring, as well as film and television shooting. However, it is still challenging to track the target robustly in UAV vision task due to several factors such as appearance variation, background clutter, and severe occlusion. In this paper, we propose a novel two-stage UAV tracking framework, which includes a target detection stage based on multifeature discrimination and a bounding-box estimation stage based on the instance-aware attention network. In the target detection stage, we explore a feature representation scheme for a small target that integrates handcrafted features, low-level deep features, and high-level deep features. Then, the correlation filter is used to roughly predict target location. In the bounding-box estimation stage, an instance-aware intersection over union (IoU)-Net is integrated together with an instance-aware attention network to estimate the target size based on the bounding-box proposals generated in the target detection stage. Extensive experimental results on the UAV123 and UAVDT datasets show that our tracker, running at over 25 frames per second (FPS), has superior performance as compared with state-of-the-art UAV visual tracking approaches.
Funder
National Natural Science Foundation of China
Subject
General Earth and Planetary Sciences
Reference39 articles.
1. Dcfnet: Discriminant correlation filters network for visual tracking;Wang;arXiv,2017
Cited by
19 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献