Author:
Trelinski Jacek, Kwolek Bogdan
Abstract
In this work, we present a new algorithm for human action recognition on raw depth maps. First, for each class we train a separate one-against-all convolutional neural network (CNN) to extract class-specific features representing the person's shape. Each class-specific multivariate time series is processed by a Siamese multichannel 1D CNN or a multichannel 1D CNN to determine features representing actions. Next, we calculate statistical features for the nonzero pixels representing the person's shape in each depth map. On the multivariate time series of these statistical features we determine Dynamic Time Warping (DTW) features, which are based on the DTW distances between all training time series. Finally, each class-specific feature vector is concatenated with the DTW feature vector. For each action category we train a multiclass classifier, which predicts a probability distribution over class labels. From the pool of such classifiers we select a subset such that an ensemble built on them achieves the best classification accuracy. Action recognition is performed by a soft voting ensemble that averages the distributions calculated by the selected classifiers, that is, those with the largest discriminative power. We demonstrate experimentally that on the MSR-Action3D and UTD-MHAD datasets the proposed algorithm attains promising results and outperforms several state-of-the-art depth-based algorithms.
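The two generic building blocks named in the abstract, DTW-distance features and soft voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean per-frame cost, the absence of any normalization or warping window, and the function names `dtw_distance`, `dtw_features`, and `soft_vote` are all assumptions made for illustration, and the CNN feature extractors and classifier selection step are left out.

```python
import math


def dtw_distance(a, b):
    """Classic DTW between two multivariate time series.

    `a` and `b` are sequences of equal-length feature vectors (one per frame).
    Assumption: Euclidean distance as the local cost, no warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def dtw_features(series, training_series):
    """Hypothetical DTW feature vector: distances to every training series."""
    return [dtw_distance(series, t) for t in training_series]


def soft_vote(distributions):
    """Average per-classifier class distributions and pick the top class.

    Returns (predicted_label_index, averaged_distribution).
    """
    k = len(distributions)
    n_classes = len(distributions[0])
    avg = [sum(d[c] for d in distributions) / k for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg
```

In this sketch a sample's DTW feature vector has one entry per training series, so its length grows with the training set; the abstract does not state whether the original method reduces or normalizes these distances.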
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Software
Cited by
14 articles.