Research on Multi-Modal Pedestrian Detection and Tracking Algorithm Based on Deep Learning

Published: 2024-05-31 Volume: 16 Issue: 6 Page: 194
ISSN: 1999-5903
Container-title: Future Internet
Language: en

Author:

Zhao Rui¹, Hao Jutao², Huo Huan¹

Affiliation:

1. Faculty of Engineering and IT, University of Technology Sydney, Ultimo 2007, Australia

2. School of Electric Information Engineering, Shanghai Dianji University, Shuihua Rd., Shanghai 201306, China

Abstract

Pedestrian detection for intelligent transportation has advanced significantly, but detection under complex lighting remains difficult. Conventional visible-light imaging is strongly affected by lighting conditions. In good daytime lighting, visibility is high and detection results are strong. In low light, visible-light images carry too little information about pedestrian targets, and detection performance drops markedly. Infrared imaging is a valuable supplement in this setting because it supplies additional pedestrian information. This paper studies pedestrian detection and tracking algorithms on multi-modal images using deep learning. It proposes a multi-modal pedestrian detection algorithm for intelligent transportation that builds on YOLOv4 and adds a channel stack fusion module. The algorithm fuses visible-light and infrared image features to improve pedestrian detection in complex road environments. Experiments show that, compared with the high-performing Visible-YOLOv4 algorithm, the proposed Double-YOLOv4-CSE algorithm improves the accuracy rate by 5.0% and reduces the log-average miss rate by 6.9%. The research aims to ensure that the algorithm runs smoothly even on a low-end 1080 Ti GPU and to broaden its coverage at the application layer, making it affordable and practical in both urban and rural areas. This addresses the broader research problem of smart cities and remote endpoints with limited computational power.
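
The abstract reports results as a log-average miss rate but does not define it. The sketch below shows one common way to compute the metric, assuming the definition widely used in pedestrian detection benchmarks: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 0.01 and 1. Whether the paper uses exactly this procedure is an assumption, and the function name and arguments are hypothetical.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi and miss_rate are parallel lists describing a miss-rate curve,
    sorted by increasing FPPI. This is an illustrative sketch, not the
    paper's evaluation code.
    """
    # Reference FPPI values evenly spaced in log space over [1e-2, 1e0].
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    samples = []
    for ref in refs:
        # Use the miss rate at the largest FPPI not exceeding the reference.
        # If the curve starts above the reference, fall back to 1.0.
        mr = 1.0
        for f, m in zip(fppi, miss_rate):
            if f <= ref:
                mr = m
            else:
                break
        samples.append(max(mr, 1e-10))  # guard against log(0)

    # Geometric mean of the sampled miss rates.
    return math.exp(sum(math.log(s) for s in samples) / len(samples))
```

Lower values are better. A detector that never reaches an FPPI of 1 or below scores 1.0 under this fallback convention.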

Publisher

MDPI AG

Link

https://www.mdpi.com/1999-5903/16/6/194/pdf

References: 25 articles.

1. Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27–30). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.

2. Jin (2016). Optimized aggregated channel features pedestrian detection algorithm based on binocular vision. Tianjin Daxue Xuebao (Ziran Kexue Yu Gongcheng Jishu Ban)/J. Tianjin Univ. Sci. Technol.

3. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21–26). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.

4. Konig, D., Adam, M., Jarvers, C., Layher, G., Neumann, H., and Teutsch, M. (2017, January 21–26). Fully convolutional region proposal networks for multispectral person detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.

5. Gen (2021). Research on high pressure vessel detection technology based on infrared image fusion algorithms. Chin. Meas. Test.