Conv3D-Based Video Violence Detection Network Using Optical Flow and RGB Data

Published:2024-01-05 Issue:2 Volume:24 Page:317
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Park Jae-Hyuk¹, Mahmoud Mohamed¹,², Kang Hyun-Soo¹

Affiliation:

1. Department of Information and Communication Engineering, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju-si 28644, Republic of Korea

2. Information Technology Department, Faculty of Computers and Information, Assiut University, Assiut 71515, Egypt

Abstract

Detecting violent behavior in videos to ensure public safety and security poses a significant challenge. Precisely identifying and categorizing instances of violence in real-life closed-circuit television, which vary across specifications and locations, requires comprehensive understanding and processing of the sequential information embedded in these videos. This study aims to introduce a model that adeptly grasps the spatiotemporal context of videos within diverse settings and specifications of violent scenarios. We propose a method to accurately capture spatiotemporal features linked to violent behaviors using optical flow and RGB data. The approach leverages a Conv3D-based ResNet-3D model as the foundational network, capable of handling high-dimensional video data. The efficiency and accuracy of violence detection are enhanced by integrating an attention mechanism, which assigns greater weight to the most crucial frames within the RGB and optical-flow sequences during instances of violence. Our model was evaluated on the UBI-Fight, Hockey, Crowd, and Movie-Fights datasets; the proposed method outperformed existing state-of-the-art techniques, achieving area under the curve scores of 95.4, 98.1, 94.5, and 100.0 on the respective datasets. Moreover, this research not only has the potential to be applied in real-time surveillance systems but also promises to contribute to a broader spectrum of research in video analysis and understanding.
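The frame-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract does not specify the attention design, so the softmax weighting, the concatenation-based fusion, and the names `softmax`, `attention_pool`, and `fuse_streams` are all assumptions made for illustration. Per-frame feature vectors stand in for the output of a Conv3D backbone.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw per-frame scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    frame_features: Sequence[Sequence[float]],
    frame_scores: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weighted sum of per-frame features (hypothetical helper).

    frame_features: one feature vector per frame, assumed to come
                    from a backbone applied to RGB or optical flow.
    frame_scores:   one relevance score per frame; in a real model
                    these would be learned, here they are given.
    """
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [0.0] * dim
    for weight, feature in zip(weights, frame_features):
        for i, value in enumerate(feature):
            pooled[i] += weight * value
    return pooled, weights


def fuse_streams(rgb: Sequence[float], flow: Sequence[float]) -> List[float]:
    """Concatenate pooled RGB and optical-flow vectors (assumed fusion)."""
    return list(rgb) + list(flow)


if __name__ == "__main__":
    rgb_frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    flow_frames = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]]
    # The second frame gets the highest score, so it dominates the pooling.
    rgb_vec, rgb_w = attention_pool(rgb_frames, [0.1, 2.0, 0.3])
    flow_vec, _ = attention_pool(flow_frames, [0.1, 2.0, 0.3])
    print(rgb_w)
    print(fuse_streams(rgb_vec, flow_vec))
```

Frames with higher scores contribute more to the pooled vector, which is the effect the abstract attributes to its attention mechanism. A classifier would then operate on the fused vector.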

Funder

Technology development Program of MSS

MSIT (Ministry of Science and ICT), Korea

Publisher

MDPI AG

Link

https://www.mdpi.com/1424-8220/24/2/317/pdf


Cited by 5 articles.

1. Elevating urban surveillance: A deep CCTV monitoring system for detection of anomalous events via human action recognition;Sustainable Cities and Society;2024-11

2. Enhancing public safety: a hybrid Conv_Trans-OptBiSVM approach for real-time abnormal behavior detection in crowded environments;Signal, Image and Video Processing;2024-09-04

3. F3DNN-Net: behaviours violence detection via fine-tuned fused feature based deep neural network from surveillance video;Signal, Image and Video Processing;2024-08-29

4. Comparative Analysis of Movement Segmentation Techniques in Untrimmed Videos Using Optical Flow and Frame Differencing Using the $1 Unistroke Recognizer;2024 Intelligent Methods, Systems, and Applications (IMSA);2024-07-13

5. Cross-Modality Interaction-Based Traffic Accident Classification;Applied Sciences;2024-02-27