A Preliminary Study of Deep Learning Sensor Fusion for Pedestrian Detection

Published:2023-04-21 Issue:8 Volume:23 Page:4167
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Plascencia Alfredo Chávez¹, García-Gómez Pablo², Perez Eduardo Bernal¹, DeMas-Giménez Gerard¹, Casas Josep R.³, Royo Santiago¹²

Affiliation:

1. Centre for Sensors, Instrumentation and Systems Development (CD6), Polytechnic University of Catalonia (UPC), Rambla de Sant Nebridi 10, 08222 Terrassa, Spain

2. Beamagine S.L., Carrer de Bellesguard 16, 08755 Castellbisbal, Spain

3. Image Processing Group, TSC Department, Polytechnic University of Catalonia (UPC), Carrer de Jordi Girona 1-3, 08034 Barcelona, Spain

Abstract

Most pedestrian detection methods focus on bounding boxes obtained by fusing RGB with lidar. These methods do not reflect how the human eye perceives objects in the real world. Furthermore, lidar and vision can have difficulty detecting pedestrians in scattered environments, and radar can be used to overcome this problem. The motivation of this work is therefore to explore, as a preliminary step, the feasibility of fusing lidar, radar, and RGB for pedestrian detection, with potential use in autonomous driving, using a fully connected convolutional neural network architecture for multimodal sensors. The core of the network is based on SegNet, a pixel-wise semantic segmentation network. In this context, lidar and radar were incorporated by transforming their 3D point clouds into 2D grayscale images with 16-bit depth, and RGB images were incorporated with three channels. The proposed architecture uses one SegNet for each sensor reading, and the outputs are then fed to a fully connected neural network that fuses the three sensor modalities. Afterwards, an up-sampling network is applied to recover the fused data. Additionally, a custom dataset of 60 images was proposed for training the architecture, with an additional 10 for evaluation and 10 for testing, giving a total of 80 images. The experimental results show a training mean pixel accuracy of 99.7% and a training mean intersection over union (IoU) of 99.5%. The testing mean IoU was 94.4%, and the testing pixel accuracy was 96.2%. These metrics demonstrate the effectiveness of semantic segmentation for pedestrian detection across the three sensor modalities. Despite some overfitting during experimentation, the model performed well in detecting people in test mode. It is therefore worth emphasizing that the focus of this work is to show that the method is feasible, as it works regardless of the size of the dataset. A bigger dataset would nevertheless be necessary to achieve more appropriate training. This method offers the advantage of detecting pedestrians as the human eye does, thereby resulting in less ambiguity. Additionally, this work proposes an extrinsic calibration matrix method for sensor alignment between radar and lidar based on singular value decomposition.
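
The abstract reports mean pixel accuracy and mean IoU without stating how they were computed. As a rough illustration only, the sketch below uses the common per-class definitions (accuracy = TP / (TP + FN), IoU = TP / (TP + FP + FN), each averaged over classes) on flattened label masks. The function names and the toy masks are hypothetical and are not taken from the paper, whose exact evaluation procedure may differ.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], truth: Sequence[int], cls: int) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for one class."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p == cls and t == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif t == cls:
            fn += 1
    return tp, fp, fn


def mean_pixel_accuracy(pred: Sequence[int], truth: Sequence[int], classes: Sequence[int]) -> float:
    """Average over classes of TP / (TP + FN); classes absent from truth are skipped."""
    scores = []
    for cls in classes:
        tp, _, fn = confusion_counts(pred, truth, cls)
        if tp + fn > 0:
            scores.append(tp / (tp + fn))
    return sum(scores) / len(scores) if scores else 0.0


def mean_iou(pred: Sequence[int], truth: Sequence[int], classes: Sequence[int]) -> float:
    """Average over classes of TP / (TP + FP + FN); empty classes are skipped."""
    scores = []
    for cls in classes:
        tp, fp, fn = confusion_counts(pred, truth, cls)
        if tp + fp + fn > 0:
            scores.append(tp / (tp + fp + fn))
    return sum(scores) / len(scores) if scores else 0.0


# Toy 2x4 masks, flattened row by row (0 = background, 1 = pedestrian).
truth_mask = [0, 0, 1, 1,
              0, 0, 1, 1]
pred_mask = [0, 0, 1, 1,
             0, 1, 1, 0]

print(mean_pixel_accuracy(pred_mask, truth_mask, classes=[0, 1]))
print(mean_iou(pred_mask, truth_mask, classes=[0, 1]))
```

In the toy masks, each class has three correctly labeled pixels, one false positive, and one false negative, so IoU is penalized by both kinds of error while pixel accuracy only counts the missed pixels. This is why mean IoU is usually the lower of the two figures, as in the results quoted above for training.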

Funder

European Union’s Horizon 2020 research and innovation program

European Union’s NextGeneration EU/PRTR and the Government of Catalonia’s Agency for Business Competitiveness

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/23/8/4167/pdf

Reference: 35 articles.

1. Alzubaidi (2021). Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data.

2. Bimbraw, K. (2015, January 21–23). Autonomous cars: Past, present and future a review of the developments in the last century, the present scenario and the expected future of autonomous vehicle technology. Proceedings of the 2015 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO), Colmar, France.

3. Yao (2019). A review of Convolutional-Neural-Network-based action recognition. Pattern Recognit. Lett.

4. Soga, M., Kato, T., Ohta, M., and Ninomiya, Y. (2005, January 3–4). Pedestrian Detection with Stereo Vision. Proceedings of the 21st International Conference on Data Engineering Workshops (ICDEW’05), Tokyo, Japan.

5. Yu, X., and Marinov, M. (2020). A Study on Recent Developments and Issues with Obstacle Detection Systems for Automated Vehicles. Sustainability, 12.

Cited by 3 articles.

1. A Switched Approach for Smartphone-Based Pedestrian Navigation;Sensors;2024-08-14

2. Research on Pedestrian Detection Based on Jetson Xavier NX Platform and YOLOv4;2023 4th International Symposium on Computer Engineering and Intelligent Communications (ISCEIC);2023-08-18

3. A Preliminary Study of Deep Learning Sensor Fusion for Pedestrian Detection;Sensors;2023-04-21