Learning to Propose and Refine for Accurate and Robust Tracking via an Alignment Convolution
Affiliation:
1. The Guangxi Key Laboratory of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
2. Guangxi Key Laboratory of Machine Vision and Intelligent Control, Wuzhou University, Wuzhou 543002, China
Abstract
Precise and robust feature extraction plays a key role in high-performance tracking for analysing videos from drones, surveillance cameras, autonomous driving systems, etc. However, most existing Siamese network-based trackers focus on constructing complicated network models and refinement strategies, while relying on comparatively simple, heuristic conventional or deformable convolutions that extract features from sampling positions that may lie far from the target region. Consequently, the coarsely extracted features may introduce background noise and degrade tracking performance. To address this issue, we present a propose-and-refine tracker (PRTracker) that combines anchor-free proposals at the coarse level with alignment convolution-driven refinement at the fine level. Specifically, at the coarse level, we design an anchor-free model that generates proposals providing more reliable regions of interest for further verification. At the fine level, an alignment convolution-based refinement strategy re-localizes the convolutional sampling positions of the proposals efficiently and effectively, which improves the accuracy of the extracted features and, in turn, the classification and regression of the proposals. Finally, a simple yet robust target mask is designed to make full use of the initial state of the target and further improve tracking performance. The proposed PRTracker achieves competitive performance on six tracking benchmarks (i.e., UAV123, VOT2018, VOT2019, OTB100, NfS and LaSOT) while running at 75 FPS.
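To make the idea of re-localizing sampling positions concrete, the following is a minimal sketch, not the authors' implementation. It assumes a proposal is given as a hypothetical `(cx, cy, w, h)` box in feature-map coordinates and that the aligned samples sit at the centres of a k x k partition of that box; the function names are illustrative only. The resulting offsets are what a deformable-style convolution would add to its regular grid.

```python
def regular_grid(px, py, k=3):
    """Sampling positions of a standard k x k convolution centred at (px, py)."""
    r = k // 2
    return [(px + dx, py + dy)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)]


def aligned_grid(box, k=3):
    """Spread k x k sampling positions uniformly over a proposal box.

    box = (cx, cy, w, h) in feature-map coordinates (assumed layout).
    Samples land on the centres of a k x k partition of the box (odd k).
    """
    cx, cy, w, h = box
    r = k // 2
    step_x, step_y = w / k, h / k  # spacing between neighbouring samples
    return [(cx + dx * step_x, cy + dy * step_y)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)]


def alignment_offsets(px, py, box, k=3):
    """Offsets that move each regular sampling position onto the aligned grid."""
    reg = regular_grid(px, py, k)
    ali = aligned_grid(box, k)
    return [(ax - rx, ay - ry) for (rx, ry), (ax, ay) in zip(reg, ali)]


if __name__ == "__main__":
    # A proposal centred at (10, 8) with width 6 and height 3,
    # evaluated for the kernel placed at feature-map location (9, 9).
    for off in alignment_offsets(9, 9, (10.0, 8.0, 6.0, 3.0)):
        print(off)
```

Under these assumptions, the offsets depend only on the proposal box and the kernel location, so they can be computed directly from the coarse-level output rather than predicted by an extra learned branch.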
Funder
Guangxi "Bagui Scholar" Teams; Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing; Guangxi Talent Highland Project of Big Data Intelligence and Application
Subject
Artificial Intelligence, Computer Science Applications, Aerospace Engineering, Information Systems, Control and Systems Engineering