Affiliation:
1. School of Mathematics and Statistics, Chongqing Jiaotong University, Chongqing 400074, China
2. School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400074, China
Abstract
With the rapid growth of the Internet user base and of the number of short videos, a large number of videos related to terrorism and violence have appeared on the Internet, posing great challenges to the governance of the online environment. At present, most short-video platforms still rely on manual review and user reports to filter such videos, and these mechanisms cannot keep pace with the growth of the short-video business in either recognition accuracy or timeliness. Among single-modality methods for recognizing violent video, this paper focuses on the scene recognition modality. First, the U-Net network is improved with the SE-block module. After pretraining on the Cityscapes dataset, the improved network performs semantic segmentation of video frames. On this basis, semantic features of scenes are extracted using a VGG16 network loaded with ImageNet-pretrained weights, and the SE-U-Net-VGG16 scene recognition model is constructed. The experimental results show that the prediction accuracy of the SE-U-Net model is much higher than that of the FCN and U-Net models, indicating that the SE-U-Net model has significant advantages for scene recognition.
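The SE-block mentioned in the abstract reweights feature channels in three steps: squeeze (global average pooling), excitation (a small bottleneck network ending in a sigmoid), and scale. The sketch below illustrates that general mechanism in dependency-free Python. It is not the authors' implementation: the abstract does not say where the block sits inside U-Net, and the function name `se_block`, the weight matrices `w1` and `w2`, and the omission of bias terms are assumptions made for illustration.

```python
import math


def se_block(feature_map, w1, w2):
    """Reweight the channels of a [C][H][W] feature map (squeeze-and-excitation).

    w1 has shape [C_reduced][C] and w2 has shape [C][C_reduced]; both are
    hypothetical weights, and bias terms are left out to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels; sigmoid yields gates in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

In a trained network the weights are learned, so informative channels receive gates close to 1 and less useful channels are suppressed.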
Funder
Group Building Scientific Innovation Project for Universities in Chongqing
Subject
Computer Networks and Communications, Computer Science Applications
Cited by
2 articles.