Affiliation:
1. Beijing Engineering Research Center for IoT Software and Systems, Beijing University of Technology, Beijing 100124, P. R. China
Abstract
This work presents a deep learning method that uses fine-tuning to classify and search a diverse set of action videos. First, a 3D convolutional neural network (3D CNN) model, trained with a pre-training step followed by fine-tuning, is employed to extract spatiotemporal features from videos. The model is first pre-trained on the UCF-101 dataset to obtain initial parameters. A small new dataset is then used to fine-tune this initial model into the final model. Once features are extracted by the final CNN model, a distance measure is used to calculate the similarity between the query video and each video in the test dataset. The retrieved videos are returned and ranked so that videos with higher similarity to the query appear first. The experimental comparison shows that the search method with fine-tuning performs better than the method without it. Second, classification results based on the fine-tuned 3D CNN model are also presented to support query by keyword. The accuracy obtained with fine-tuning is approximately 2.8% higher than that obtained without fine-tuning.
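The search step described in the abstract (compare the query video's features with those of the test videos, then rank by similarity) can be illustrated with a minimal sketch. The abstract does not say which distance measure is used, so cosine similarity is an assumption here. The names `cosine_similarity`, `rank_videos`, and the toy three-dimensional feature vectors are hypothetical; in practice the features would come from the fine-tuned 3D CNN.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the abstract only says "distance measure"; cosine
    similarity is used here purely for illustration.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_videos(query_feature, gallery_features, top_k=None):
    """Return (video_id, similarity) pairs, most similar first.

    gallery_features maps a hypothetical video id to its feature vector.
    """
    scored = [
        (video_id, cosine_similarity(query_feature, feature))
        for video_id, feature in gallery_features.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Toy stand-ins for features extracted by the fine-tuned model.
gallery = {
    "video_a": [1.0, 0.0, 0.0],
    "video_b": [0.9, 0.1, 0.0],
    "video_c": [0.0, 1.0, 0.0],
}
ranking = rank_videos([1.0, 0.05, 0.0], gallery, top_k=2)
for video_id, score in ranking:
    print(video_id, round(score, 4))
```

Sorting in descending order of similarity places the closest matches first, which matches the ranking behavior the abstract describes. A different measure, such as Euclidean distance, would only require swapping the scoring function and sorting in ascending order.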
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence,Computer Vision and Pattern Recognition,Software
Cited by
6 articles.