NRPerson: A Non-Registered Multi-Modal Benchmark for Tiny Person Detection and Localization
Published: 2024-04-27
Volume: 13
Issue: 9
Page: 1697
ISSN: 2079-9292
Container-title: Electronics
Language: en
Author:
Yang Yi 1, Han Xumeng 1, Wang Kuiran 1, Yu Xuehui 1, Yu Wenwen 1, Wang Zipeng 1, Li Guorong 1, Han Zhenjun 1, Jiao Jianbin 1
Affiliation:
1. School of Electronic, Electrical, and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China
Abstract
In recent years, the detection and localization of tiny persons have attracted significant attention due to their critical role in surveillance and security scenarios. Traditional multi-modal methods rely predominantly on well-registered image pairs, which require sophisticated sensors and extensive manual registration effort, limiting their practical utility in dynamic, real-world environments. To address this gap, this paper introduces NRPerson, a novel non-registered multi-modal benchmark designed to advance tiny person detection and localization under the complexities of real-world conditions. The NRPerson dataset comprises 8548 RGB-IR image pairs, collected and filtered from 22 video sequences, together with 889,207 high-quality annotations that have been manually verified for accuracy. Using NRPerson, we evaluate several leading detection and localization models in both mono-modal and non-registered multi-modal settings. We further develop a comprehensive set of natural multi-modal baselines for the non-registered track, aiming to improve detection and localization on unregistered multi-modal data through a cohesive, generalized approach. By reducing the reliance on stringent registration requirements, this benchmark is intended to move detection and localization technologies closer to practical deployment.
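To make the non-registered setting concrete, the following minimal Python sketch models one sample as a pair of independently annotated modalities. This is an illustration only: the directory layout ("rgb/", "ir/"), the name-based pairing convention, and all field names are assumptions, since the abstract does not specify NRPerson's actual file format or annotation schema.

# Illustrative sketch only: the abstract does not specify NRPerson's on-disk
# format, so the directory layout, file naming, and field names below are
# assumptions, not the dataset's actual schema.
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Tuple

@dataclass
class ModalityAnnotations:
    """Person boxes for ONE modality, as (x, y, w, h) in that image's own
    pixel frame. "Non-registered" means no shared frame can be assumed."""
    image_path: Path
    boxes: List[Tuple[float, float, float, float]] = field(default_factory=list)

@dataclass
class NonRegisteredPair:
    """One RGB-IR sample: each modality carries its own coordinates and its
    own annotation set; no homography or pixel-level alignment is assumed."""
    rgb: ModalityAnnotations
    ir: ModalityAnnotations

def load_pairs(root: Path) -> List[NonRegisteredPair]:
    """Pair RGB and IR frames by shared file name (hypothetical convention)."""
    pairs: List[NonRegisteredPair] = []
    for rgb_path in sorted((root / "rgb").glob("*.jpg")):
        ir_path = root / "ir" / rgb_path.name
        if ir_path.exists():  # keep only frames present in both modalities
            pairs.append(NonRegisteredPair(
                rgb=ModalityAnnotations(image_path=rgb_path),
                ir=ModalityAnnotations(image_path=ir_path),
            ))
    return pairs

The key design point is that each modality keeps its own annotation set: unlike registered benchmarks, a single shared set of boxes cannot be assumed, which is exactly what the non-registered baselines described in the abstract must handle.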