Abstract
In the era of digital media, the rapidly increasing volume and complexity of multimedia data make it difficult to store, process, and query information in a reasonable time. Feature extraction and processing time play a critical role in large-scale video retrieval systems and currently receive much attention from researchers. We therefore propose an efficient approach to feature extraction on big video datasets using deep learning techniques. The approach focuses on three main features, namely subtitles, speech, and objects in video frames, and combines three techniques to extract them: optical character recognition (OCR), automatic speech recognition (ASR), and deep-learning-based object identification. We provide three network models developed from Faster R-CNN ResNet, Faster R-CNN Inception ResNet V2, and Single Shot Detector MobileNet V2. The approach is implemented in Spark, a next-generation parallel and distributed computing environment, which reduces the time and space costs of the feature extraction process. Experimental results show that our proposal achieves an accuracy of 96% and reduces processing time by 50%. This demonstrates the feasibility of the approach for content-based video retrieval systems in a big data context.
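As a rough illustration of the pipeline described above, the following minimal sketch combines the three feature types per frame and extracts them in parallel. It is not the authors' implementation: the functions `run_ocr`, `run_asr`, and `run_detector` are hypothetical stand-ins that read precomputed fields rather than calling real OCR, ASR, or detection models, the frame dictionaries and their keys are assumed for illustration, and a standard-library thread pool is swapped in for Spark.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class FrameFeatures:
    frame_id: int
    subtitles: str      # text an OCR engine would read from the frame
    speech: str         # transcript an ASR engine would produce for the audio
    objects: List[str]  # labels an object detector would return


# Hypothetical stand-ins: each reads a precomputed field instead of running a model.
def run_ocr(frame: Dict) -> str:
    return frame.get("text", "")


def run_asr(frame: Dict) -> str:
    return frame.get("audio_transcript", "")


def run_detector(frame: Dict) -> List[str]:
    return list(frame.get("labels", []))


def extract_frame(frame: Dict) -> FrameFeatures:
    """Combine the three feature types for a single frame."""
    return FrameFeatures(frame["id"], run_ocr(frame), run_asr(frame), run_detector(frame))


def extract_all(frames: List[Dict], workers: int = 4) -> List[FrameFeatures]:
    """Extract features for all frames in parallel, preserving input order.

    Assumption: a local thread pool stands in for a distributed Spark job.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(extract_frame, frames))


def keywords(features: FrameFeatures) -> Set[str]:
    """Merge subtitle words, speech words, and object labels into one keyword set."""
    words = (features.subtitles + " " + features.speech).lower().split()
    return set(words) | {label.lower() for label in features.objects}
```

The per-frame function is kept free of shared state so that the same mapping step could, in principle, be distributed across workers; the keyword set is one simple way the extracted features might feed a content-based retrieval index.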
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
16 articles.