Affiliation:
1. School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi, Vietnam
2. Viet-Hung Industry University, Hanoi, Vietnam
3. Faculty of Electrical-Electronic Engineering, University of Transport and Communications, Hanoi, Vietnam
4. International Research Institute MICA, Hanoi University of Science and Technology, Hanoi, Vietnam
Abstract
Fusion techniques that aim to leverage the discriminative power of different appearance features for person representation have been widely applied in person re-identification. They are performed either by concatenating all feature vectors (known as early fusion) or by combining the matching scores of different classifiers (known as late fusion). Previous studies have shown that late fusion techniques achieve better results than early fusion ones. However, the majority of these studies focus on determining suitable weighting schemes that reflect the role of each feature. The determined weights are then integrated into conventional similarity functions, such as Cosine [L. Zheng, S. Wang, L. Tian, F. He, Z. Liu and Q. Tian, Query-adaptive late fusion for image search and person re-identification, in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2015, pp. 1741–1750]. The contribution of this paper is two-fold. First, a robust person re-identification method that combines metric learning with late fusion techniques is proposed. The metric learning method Cross-view Quadratic Discriminant Analysis (XQDA) is employed to learn a discriminant low-dimensional subspace that minimizes the intra-person distance while maximizing the inter-person distance. Product rule-based and sum rule-based late fusion techniques are then applied to these distances. Second, concerning feature engineering, the ResNet extraction process has been modified in order to extract local features from different stripes in person images. To show the effectiveness of the proposed method, both single-shot and multi-shot scenarios are considered. Three state-of-the-art features, namely Gaussians of Gaussians (GOG), Local Maximal Occurrence (LOMO) and deep-learned features extracted through a Residual network (ResNet), are extracted from person images.
The experimental results on three benchmark datasets, iLIDS-VID, PRID-2011 and VIPeR, show that the proposed method achieves [Formula: see text]% [Formula: see text]% of improvement over the best results obtained with a single feature. The proposed method achieves rank-1 accuracies of 85.73%, 93.82% and 50.85% on iLIDS-VID, PRID-2011 and VIPeR, respectively, and outperforms different state-of-the-art methods, including deep learning ones. Source code is publicly available to facilitate the development of person re-ID systems.
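The sum rule and product rule mentioned above can be illustrated with a minimal sketch. It is not the authors' released code: the function names (`min_max_normalise`, `fuse_scores`, `rank_gallery`), the min-max normalisation, the conversion of distances to similarities, and the equal default weights are all assumptions made for illustration. The input is assumed to be, for one query, a list of distances to every gallery item for each feature (for example the distances obtained after metric learning on GOG, LOMO and ResNet features).

```python
from math import prod


def min_max_normalise(distances):
    """Scale one query's distances to [0, 1]; a constant row maps to all zeros."""
    lo, hi = min(distances), max(distances)
    if hi == lo:
        return [0.0 for _ in distances]
    return [(d - lo) / (hi - lo) for d in distances]


def fuse_scores(per_feature_distances, weights=None, rule="sum"):
    """Fuse per-feature distances from one query to every gallery item.

    per_feature_distances: one inner list per feature, each holding the
    distance from the query to every gallery item.
    Returns one similarity score per gallery item (higher is a better match).
    """
    n_features = len(per_feature_distances)
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0 / n_features] * n_features
    # Assumption: normalised distances are turned into similarities in [0, 1].
    sims = [[1.0 - d for d in min_max_normalise(row)]
            for row in per_feature_distances]
    fused = []
    for j in range(len(sims[0])):
        column = [sims[i][j] for i in range(n_features)]
        if rule == "sum":
            # Sum rule: weighted sum of the per-feature similarities.
            fused.append(sum(w * s for w, s in zip(weights, column)))
        elif rule == "product":
            # Product rule: product of similarities raised to their weights.
            fused.append(prod(s ** w for w, s in zip(weights, column)))
        else:
            raise ValueError(f"unknown rule: {rule}")
    return fused


def rank_gallery(fused):
    """Gallery indices sorted from best to worst match."""
    return sorted(range(len(fused)), key=lambda j: fused[j], reverse=True)
```

Under these assumptions, the product rule drives the fused score to zero as soon as one feature gives a gallery item its worst distance, whereas the sum rule lets the other features compensate.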
Funder
Vietnam Ministry of Education and Training
Publisher
World Scientific Pub Co Pte Lt
Cited by
2 articles.