Author:
Song Kwangho, Kim Yoo-Sung
Abstract
An enhanced multimodal stacking scheme is proposed for quick and accurate online detection of harmful pornographic content on the Internet. To detect harmful content accurately, implicative visual features are extracted using a bi-directional RNN (recurrent neural network) with VGG-16, and implicative auditory features are extracted using a bi-directional RNN with a multilayered dilated convolutional network, so that the features implicitly express the signal change patterns over time within each input. A video classifier and an audio classifier are trained using only the implicative visual features and only the implicative auditory features, respectively. A fusion classifier is also trained using both feature types together. These three component classifiers are then stacked in the enhanced ensemble scheme and applied serially, in the order fusion classifier, video classifier, audio classifier, to reduce false negative errors while keeping online detection quick. The proposed multimodal stacking scheme yields a true positive rate of 95.40% and a false negative rate of 4.60%, both better than the values reported in previous studies. In addition, the proposed stacking scheme accurately detects harmful content up to 74.58% faster, and on average 62.16% faster, than the previous stacking scheme. The proposed enhanced multimodal stacking scheme can therefore be used to filter out harmful content quickly and accurately in online environments.
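The serial order described in the abstract can be illustrated with a minimal sketch. The abstract does not state the decision rule, so this sketch assumes that an input is flagged as harmful as soon as any stage scores it at or above a threshold, which is one way a serial stack could reduce false negatives while exiting early. The names `serial_stack` and `STAGES`, the 0.5 threshold, and the stub scorers that simply read precomputed scores from a dictionary are all hypothetical and stand in for the trained fusion, video, and audio classifiers.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical stage type: maps a sample to a harmfulness score in [0, 1].
Scorer = Callable[[Dict[str, float]], float]

# Stub scorers standing in for the trained classifiers (assumption: the real
# models would compute these scores from implicative visual/auditory features).
STAGES: Sequence[Tuple[str, Scorer]] = (
    ("fusion", lambda sample: sample["fusion"]),
    ("video", lambda sample: sample["video"]),
    ("audio", lambda sample: sample["audio"]),
)


def serial_stack(
    sample: Dict[str, float],
    stages: Sequence[Tuple[str, Scorer]] = STAGES,
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> Tuple[bool, Optional[str]]:
    """Run the stages in order and stop at the first one that flags the sample.

    Returns (is_harmful, name_of_deciding_stage). If no stage flags the
    sample, returns (False, None).
    """
    for name, scorer in stages:
        if scorer(sample) >= threshold:
            return True, name  # early exit: later stages are not evaluated
    return False, None


def rates(true_positives: int, false_negatives: int) -> Tuple[float, float]:
    """Return (true positive rate, false negative rate) over harmful inputs."""
    total = true_positives + false_negatives
    return true_positives / total, false_negatives / total
```

Under this assumed rule, a sample missed by the fusion stage can still be caught by the video or audio stage, and a sample caught by the fusion stage never reaches the later stages. The `rates` helper reflects that the true positive rate and false negative rate are complementary, which is consistent with the reported 95.40% and 4.60% summing to 100%.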
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science