Abstract
Classifying violent actions in community-based surveillance is particularly challenging. Because violence is an ambiguous, complex action, detection models can misclassify violence-related crimes, and intelligent surveillance systems become more complex, raising operating costs or, at worst, costing lives. This paper presents a novel approach to automatic violence detection that treats violence as a complex action, which mitigates oversimplification and overgeneralization in detection models. The work supports the notion that violence can be classified by decomposing it into simpler, more identifiable actions that human action recognition algorithms recognize easily. A two-stage framework was designed: a two-stream action recognition architecture first detects simple actions that are sub-concepts of violence, and a basic logistic regression layer then classifies those simple actions as complex actions for violence detection. Several configurations were tested, including action silhouettes, varying activation cache sizes, and different pooling methods for post-classification smoothing. The framework was evaluated on accuracy, recall, and operational speed, given the implications of each for community deployment. The experimental results show that the framework runs at 21 FPS in real-time operation and 11 FPS in non-real-time operation. With the proposed variable caching algorithm, median pooling reaches an accuracy of 83.08% and 80.50% for non-real-time and real-time operation, respectively. By comparison, max pooling reaches a recall of 89.55% and 84.93% for non-real-time and real-time operation, respectively. This performance is comparable to that of existing efforts that do not treat violence as a complex action, which indicates that complex action decomposition is an appropriate method and offers a new perspective on automatic violence detection in intelligent surveillance systems.
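The trade-off between median and max pooling in post-classification smoothing can be illustrated with a minimal sketch. It assumes per-frame violence scores in [0, 1] and a fixed-size cache of recent scores; the function name `smooth_scores`, the `cache_size` parameter, the 0.5 threshold, and the sample scores are all hypothetical. This is not the paper's variable caching algorithm, only a simplified stand-in for the idea of pooling over cached activations.

```python
from collections import deque
from statistics import median


def smooth_scores(scores, cache_size=3, pooling="median"):
    """Pool each frame's score with the most recent scores in a fixed-size cache.

    Hypothetical helper: a fixed cache stands in for the paper's variable cache.
    """
    if pooling not in ("median", "max"):
        raise ValueError("pooling must be 'median' or 'max'")
    cache = deque(maxlen=cache_size)  # oldest score is dropped automatically
    smoothed = []
    for score in scores:
        cache.append(score)
        pooled = median(cache) if pooling == "median" else max(cache)
        smoothed.append(pooled)
    return smoothed


# Illustrative scores: one isolated spike (frame 2), then a sustained event.
scores = [0.1, 0.2, 0.9, 0.1, 0.2, 0.8, 0.9, 0.7]

median_out = smooth_scores(scores, cache_size=3, pooling="median")
max_out = smooth_scores(scores, cache_size=3, pooling="max")

# Threshold of 0.5 is an assumption chosen for illustration.
median_flags = [s >= 0.5 for s in median_out]
max_flags = [s >= 0.5 for s in max_out]

print(median_flags)
print(max_flags)
```

In this toy sequence, median pooling ignores the isolated spike and flags only the sustained event, while max pooling flags every window that contains any high score. That behaviour is consistent with the abstract's report that median pooling favours accuracy and max pooling favours recall.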
Funder
National Science and Technology Council
Publisher
Springer Science and Business Media LLC