Affiliation:
1. Institute of Computing, University of Campinas, Campinas-SP, 13083-852, Brazil
Abstract
Due to rapid advances in the development of surveillance cameras with high sampling rates, low cost, small size and high resolution, video-based action recognition systems have become more commonly used in various computer vision applications. Such systems can support human operators in detecting events of interest in video sequences, improving recognition results and reducing failure cases. In this work, we propose and evaluate a method to learn two-dimensional (2D) representations from video sequences based on an autoencoder framework. Spatial and temporal information is explored through a multi-stream convolutional neural network in the context of human action recognition. Experimental results on the challenging UCF101 and HMDB51 datasets demonstrate that our representation achieves competitive accuracy rates when compared to other approaches available in the literature.
Funder
Fundação de Amparo à Pesquisa do Estado de São Paulo
Conselho Nacional de Desenvolvimento Científico e Tecnológico
Publisher
World Scientific Pub Co Pte Ltd
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Software
Cited by
6 articles.