Abstract
While recent object trackers, which employ segmentation methods for bounding box estimation, have achieved significant advancements in tracking accuracy, they are still limited in their ability to accommodate geometric transformations. This limitation results in poor performance over long sequences in aerial object-tracking applications. To mitigate this problem, we propose a novel real-time tracking framework consisting of deformation modules. These modules model geometric variations and appearance changes at different levels for segmentation purposes. Specifically, the proposal deformation module produces a local tracking region by learning a geometric transformation from the previous state. By decomposing the target representation into templates corresponding to parts of the object, the kernel deformation module performs local cross-correlation in a computationally and parameter-efficient manner. Additionally, we introduce a mask deformation module to increase tracking flexibility by choosing the most important correlation kernels adaptively. Our final segmentation tracker achieves state-of-the-art performance on six tracking benchmarks, producing segmentation masks and rotated bounding boxes at over 60 frames per second.