Affiliation:
1. School of Big Data and Computer Science, Guizhou Normal University, Guiyang, China
2. Guizhou Key Laboratory of Information and Computing Science, Guizhou Normal University, Guiyang, China
3. Engineering Laboratory for Applied Technology of Big Data in Education, Guizhou Normal University, Guiyang, China
Abstract
In an agnostic space environment, aerial person re‐identification (Re‐ID) is a task in which the query person may not occur in the gallery set. It is considered a subtask of open‐world person Re‐ID and is a more challenging and more practical research problem. Compared with ground‐level images, aerial person images captured by unmanned aerial vehicles pose greater challenges, such as weak appearance features, fewer samples per individual, and occlusion, owing to variations in camera height and viewing angle. Most state‐of‐the‐art person Re‐ID methods developed for open‐world datasets rely heavily on local convolutional neural networks and perform suboptimally when applied directly to aerial person Re‐ID. In this article, we introduce a parameter instance learning model based on vision transformers (ViT) for aerial person Re‐ID. First, we employ a self‐supervised paradigm grounded in parameter instance discrimination to capture feature alignment and instance similarity. Second, using labeled training data, we optimize the network with two types of loss functions. Finally, we apply a feature enhancement strategy based on zero‐padding and displacement, which directly improves the robustness of the ViT model to occlusion and misalignment. Experiments on a Re‐ID dataset validate the effectiveness of the method: our approach achieves a mean average precision of 57.31% and a Rank‐1 accuracy of 65.29% on the aerial person Re‐ID dataset PRAI‐1581.
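For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Rank‐1 accuracy and mean average precision are commonly computed from a query-to-gallery distance matrix. It is not the authors' evaluation code: the function name `rank1_and_map`, its arguments, and the choice to skip queries that have no true match in the gallery are assumptions made for illustration, and protocol details such as camera-based filtering are omitted.

```python
import numpy as np


def rank1_and_map(dist, query_ids, gallery_ids):
    """Compute Rank-1 accuracy and mean average precision (illustrative sketch).

    dist: (num_query, num_gallery) array, smaller value = more similar.
    query_ids, gallery_ids: 1-D integer arrays of identity labels.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    rank1_hits, average_precisions = [], []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                    # gallery indices, nearest first
        matches = gallery_ids[order] == query_ids[q]   # relevance of each ranked item
        if not matches.any():
            # Assumption: a query with no true match in the gallery is skipped.
            continue
        rank1_hits.append(float(matches[0]))           # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1        # 1-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())   # average precision for this query

    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))
```

Rank‐1 here is the fraction of evaluated queries whose nearest gallery image shares the query identity, while mean average precision averages, over queries, the precision measured at the rank of each true match.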
Funder
National Natural Science Foundation of China