Abstract
Most existing violence recognition methods rely on complex network structures with high computational cost, so they cannot meet the requirements of large-scale deployment. This paper aims to reduce model complexity so that violence recognition can run on mobile intelligent terminals. To this end, we propose MobileNet-TSM, a lightweight network that uses MobileNet-V2 as its backbone. Incorporating temporal shift modules (TSM), which exchange information between frames, strengthens the network's ability to extract dynamic features across consecutive frames. Extensive experiments are conducted to validate the method. The proposed model has only 8.49 MB of parameters and an estimated total size of 175.86 MB. Compared with existing methods, it greatly reduces model size at the cost of an accuracy gap of about 3%. The model achieves accuracies of 97.959%, 97.5%, and 87.75% on three public datasets (Crowd Violence, Hockey Fights, and RWF-2000), respectively. Based on this model, we also build a real-time violence recognition application for Android terminals. The source code and trained models are available at https://github.com/1840210289/MobileNet-TSM.git.
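The abstract says that temporal shift modules exchange information between frames. The following is a minimal NumPy sketch of that general idea, not the authors' implementation: a fraction of the channels is shifted one step along the time axis in each direction, so every frame sees some features from its neighbours. The function name `temporal_shift`, the `(T, C, H, W)` layout, and the `fold_div=8` default are assumptions made for illustration; the paper's actual settings are in the linked repository.

```python
import numpy as np


def temporal_shift(x: np.ndarray, fold_div: int = 8) -> np.ndarray:
    """Shift part of the channels along the time axis.

    x has shape (T, C, H, W): T frames, C channels per frame.
    fold_div is an assumed hyperparameter: C // fold_div channels are
    shifted in each temporal direction, the rest are left in place.
    """
    t, c, h, w = x.shape
    fold = c // fold_div
    out = np.zeros_like(x)

    # First group of channels: frame i receives features from frame i + 1.
    out[:-1, :fold] = x[1:, :fold]
    # Second group of channels: frame i receives features from frame i - 1.
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]
    # Remaining channels are copied unchanged.
    out[:, 2 * fold:] = x[:, 2 * fold:]
    return out


# Hypothetical clip: 8 frames, 16 channels, 4x4 feature maps.
frames = np.random.rand(8, 16, 4, 4)
shifted = temporal_shift(frames)
```

The shift itself adds no parameters and no multiplications, which is why this kind of module suits a lightweight 2D backbone: temporal mixing comes from moving data, and the following 2D convolution then combines features from adjacent frames.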
Funder
Basic Research Fund of the Engineering University of PAP
Innovative research project on training objects of high-level scientific and technological talents of PAP
Publisher
Public Library of Science (PLoS)
References: 51 articles.
Cited by
2 articles.