Affiliation:
1. School of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an 710054, China
Abstract
Video anomaly detection is a critical component of intelligent video surveillance systems and has been extensively deployed and studied in industry and academia. However, existing methods generalize too well to anomalous samples, which makes anomalies hard to separate from normal events. They also fail to exploit high-level semantic and temporal contextual information in videos, resulting in unstable prediction performance. To alleviate these issues, we propose an encoder–decoder model named SMAMS, based on a spatiotemporal masked autoencoder and memory modules. First, we represent video events as spatiotemporal cubes and mask some of them. Then, the unmasked patches are fed into the spatiotemporal masked autoencoder to extract high-level semantic and spatiotemporal features of the video events. Next, we add multiple memory modules to store the features of unmasked video patches at different feature layers. Finally, skip connections are introduced to compensate for crucial information lost through the memory modules. Experimental results show that the proposed method outperforms state-of-the-art methods, achieving AUC scores of 99.9%, 94.8%, and 78.9% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets, respectively.
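As an illustration only, the cube masking and memory-read steps described in the abstract can be sketched in plain Python. The cube size, the 75% mask ratio, the cosine-similarity soft addressing, and all function names below are assumptions made for this sketch; the abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random


def cube_grid(frames, height, width, cube=(2, 16, 16)):
    """Return the (t, y, x) origin of every non-overlapping spatiotemporal cube.

    The cube size is a hypothetical default, not a value from the paper.
    """
    ct, ch, cw = cube
    if frames % ct or height % ch or width % cw:
        raise ValueError("video dimensions must be divisible by the cube size")
    return [
        (t, y, x)
        for t in range(0, frames, ct)
        for y in range(0, height, ch)
        for x in range(0, width, cw)
    ]


def random_mask(num_cubes, mask_ratio=0.75, seed=None):
    """Randomly split cube indices into (visible, masked) lists.

    Only the visible cubes would be passed to the encoder.
    The mask ratio is an assumed value.
    """
    rng = random.Random(seed)
    order = list(range(num_cubes))
    rng.shuffle(order)
    num_masked = int(num_cubes * mask_ratio)
    return sorted(order[num_masked:]), sorted(order[:num_masked])


def memory_read(query, memory):
    """Soft-address memory slots by cosine similarity (an assumed scheme).

    Returns the weighted sum of the slots and the addressing weights.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b + 1e-12)

    sims = [cosine(query, slot) for slot in memory]
    peak = max(sims)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    read = [
        sum(w * slot[i] for w, slot in zip(weights, memory))
        for i in range(len(query))
    ]
    return read, weights


if __name__ == "__main__":
    origins = cube_grid(frames=8, height=32, width=32)
    visible, masked = random_mask(len(origins), mask_ratio=0.75, seed=0)
    print(len(origins), "cubes:", len(visible), "visible,", len(masked), "masked")
    read, weights = memory_read([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print("addressing weights:", weights)
```

In this sketch, a feature that resembles a stored slot receives a larger addressing weight, so the read-out is pulled toward patterns already held in memory. That is the intuition behind using memory to limit how well anomalous inputs are reconstructed.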
Funder
Chinese Postdoctoral Science Foundation