Binary Dense SIFT Flow Based Position-Information Added Two-Stream CNN for Pedestrian Action Recognition

Published:2022-10-17 Issue:20 Volume:12 Page:10445
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Park Sang Kyoo, Chung Jun Ho, Pae Dong Sung, Lim Myo Taeg

Abstract

Pedestrian behavior recognition in the driving environment is an important technology for preventing pedestrian accidents by predicting a pedestrian's next movement. Predicting future pedestrian behavior requires recognizing current pedestrian behavior. However, many studies have recognized visible human characteristics such as the face, body parts, or clothes, while few have recognized pedestrian behavior. Recognizing pedestrian behavior in the driving environment is challenging because illumination conditions vary outdoors and vehicle movement changes the camera field of view. In this paper, to predict pedestrian behavior, we introduce a position-information added two-stream convolutional neural network (CNN) with multitask learning that is robust to the limited conditions of the outdoor driving environment. The conventional two-stream CNN is the most widely used model for human-action recognition. However, the conventional two-stream CNN based on optical flow has limitations for pedestrian behavior recognition from a moving vehicle because optical flow assumes brightness constancy and piecewise smoothness. To address this problem, we use the binary descriptor dense scale-invariant feature transform (SIFT) flow, a feature-based matching algorithm that is robust in recognizing moving-pedestrian behavior, such as walking and standing, from a moving vehicle. However, recognizing cross attributes, such as crossing or not crossing the street, is challenging with the binary descriptor dense SIFT flow alone, because pedestrians who cross the road and those who do not perform the same walking action and differ only in their location in the image. Therefore, pedestrian position information is added to the conventional binary descriptor dense SIFT flow two-stream CNN. As a result, learning that was biased toward action attributes becomes balanced across action and cross attributes. In addition, YOLO detection and the Siamese tracker are used instead of the ground-truth bounding box to demonstrate robustness in action- and cross-attribute recognition from a moving vehicle. The JAAD and PIE datasets were used for training, and only the JAAD dataset was used for testing, to allow comparison with other state-of-the-art research on multitask and single-task learning.
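The abstract does not specify how position information is encoded, how it is combined with the two streams, or how the two tasks are weighted. The following is only a minimal sketch under stated assumptions: the position is taken to be a bounding box normalized by the image size, the fusion is taken to be plain concatenation, and the multitask objective is taken to be a weighted sum of two cross-entropy terms. All function names, the `cross_weight` parameter, and the example values are hypothetical and are not taken from the paper.

```python
import math


def normalized_position(box, image_width, image_height):
    """Map a pixel box (x1, y1, x2, y2) to [center_x, center_y, width, height] in [0, 1].

    Assumption: this encoding is illustrative; the paper may encode position differently.
    """
    x1, y1, x2, y2 = box
    return [
        (x1 + x2) / 2 / image_width,
        (y1 + y2) / 2 / image_height,
        (x2 - x1) / image_width,
        (y2 - y1) / image_height,
    ]


def fuse_features(spatial_features, temporal_features, position):
    """Concatenate appearance-stream, flow-stream, and position features (assumed fusion)."""
    return list(spatial_features) + list(temporal_features) + list(position)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class."""
    return -math.log(softmax(logits)[target_index])


def multitask_loss(action_logits, action_label, cross_logits, cross_label, cross_weight=1.0):
    """Weighted sum of the action-attribute and cross-attribute losses (assumed form)."""
    action_term = cross_entropy(action_logits, action_label)
    cross_term = cross_entropy(cross_logits, cross_label)
    return action_term + cross_weight * cross_term


# Hypothetical example: one pedestrian box in a 1000 x 1000 image.
position = normalized_position((100, 200, 300, 600), 1000, 1000)
features = fuse_features([0.5, 0.1], [0.3, 0.7], position)
loss = multitask_loss([2.0, 0.5], 0, [0.2, 1.5], 1)
```

In this sketch, two pedestrians with identical walking features but different box locations produce different fused vectors, which is the intuition the abstract gives for adding position information to separate crossing from not crossing.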

Funder

National Research Foundation of Korea

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science

Link

https://www.mdpi.com/2076-3417/12/20/10445/pdf

Cited by 4 articles.

1. PIT: Progressive Interaction Transformer for Pedestrian Crossing Intention Prediction;IEEE Transactions on Intelligent Transportation Systems;2023-12

2. A Novel Two-Stream Transformer-Based Framework for Multi-Modality Human Action Recognition;Applied Sciences;2023-02-05

3. RLSTM: A Novel Residual and Recurrent Network for Pedestrian Action Classification;Computer Analysis of Images and Patterns;2023

4. Self-Supervised Video Representation and Temporally Adaptive Attention for Audio-Visual Event Localization;Applied Sciences;2022-12-09