Author:
Akan Sara, Varlı Songül, Bhuiyan Mohammad Alfrad Nobel
Abstract
The re-identification (ReID) of objects in images is a widely studied topic in computer vision, with significant relevance to various applications. This study focuses on the ReID of players in broadcast videos of team sports, specifically on identifying the same player in images taken from various camera angles at any given moment during a game. This task differs from other person ReID applications because players on the same team wear very similar clothing, there are few samples for each identity, and image resolutions are low. One of the hardest parts of object ReID is extracting robust feature representations. Despite the great success of current convolutional neural network (CNN)-based methods, most studies only consider learning representations from images, neglecting long-range dependencies. Transformer-based models are increasingly studied and yield encouraging results, but Transformers still have trouble extracting features from small objects and visual cues. To address these issues, we enhanced the Swin Transformer by leveraging CNNs. We created a regional feature extraction Swin Transformer (RFES) backbone to improve local and small-scale object feature extraction. We also use three loss functions to handle imbalanced data and emphasize challenging cases. Re-ranking with k-reciprocal encoding was used in the retrieval phase, and its evaluation results are reported. Finally, we conducted experiments on the Market-1501 and SoccerNet-v3 ReID datasets. Experimental results show that the proposed ReID method reaches rank-1 accuracy of 96.2% with mAP of 89.1 on Market-1501 and rank-1 accuracy of 84.1% with mAP of 86.7 on SoccerNet-v3, outperforming state-of-the-art approaches.
Publisher
Springer Science and Business Media LLC