Multi-view region-adaptive multi-temporal DMM and RGB action recognition

Published:2020-04-21 Issue:4 Volume:23 Page:1587-1602
ISSN:1433-7541
Container-title:Pattern Analysis and Applications
language:en
Short-container-title:Pattern Anal Applic

Author:

Al-Faris Mahmoud, Chiverton John P., Yang Yanyan, Ndzi David

Abstract

Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel multi-view region-adaptive multi-resolution-in-time depth motion map (MV-RAMDMM) formulation combined with appearance information. Multi-stream 3D convolutional neural networks (CNNs) are trained on the different views and time resolutions of the region-adaptive depth motion maps. Multiple views are synthesised to enhance the view invariance. The region-adaptive weights, based on localised motion, accentuate and differentiate parts of actions possessing faster motion. Dedicated 3D CNN streams for multi-time resolution appearance information are also included. These help to identify and differentiate between small object interactions. A pre-trained 3D-CNN is used here with fine-tuning for each stream along with multi-class support vector machines. Average score fusion is used on the output. The developed approach is capable of recognising both human action and human–object interaction. Three public-domain data-sets, namely MSR 3D Action, Northwestern UCLA multi-view actions and MSR 3D daily activity, are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.
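The abstract states that average score fusion is applied to the outputs of the separate streams. As a rough illustration only, the sketch below averages per-class scores across streams and picks the highest fused score. The function names, the score values and the three-stream, three-class layout are all assumptions made for this example; the paper's actual classifiers, score scaling and stream count are not given in this record.

```python
from typing import List, Sequence


def average_score_fusion(stream_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores across streams (hypothetical helper, not the authors' code)."""
    if not stream_scores:
        raise ValueError("need at least one stream")
    n_classes = len(stream_scores[0])
    if any(len(s) != n_classes for s in stream_scores):
        raise ValueError("all streams must score the same set of classes")
    n_streams = len(stream_scores)
    # Unweighted mean of each class score over all streams.
    return [sum(s[c] for s in stream_scores) / n_streams for c in range(n_classes)]


def predict(stream_scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = average_score_fusion(stream_scores)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores: one row per stream, one column per action class.
scores = [
    [0.2, 0.5, 0.3],  # e.g. a depth-motion-map stream (assumed)
    [0.1, 0.3, 0.6],  # e.g. a second view or time resolution (assumed)
    [0.3, 0.3, 0.4],  # e.g. an RGB appearance stream (assumed)
]
fused = average_score_fusion(scores)
label = predict(scores)
```

In this toy case the first stream alone would choose class 1, while the fused scores favour class 2, which is the kind of disagreement that fusion across streams is meant to resolve.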

Funder

Higher Committee for Education Development in Iraq

Google

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition

Link

https://link.springer.com/content/pdf/10.1007/s10044-020-00886-5.pdf

Reference: 72 articles.

1. Park S, Kim D (2018) Video surveillance system based on 3d action recognition. In: 2018 Tenth international conference on ubiquitous and future networks (ICUFN). IEEE, pp 868–870

2. Li Z, Tang J, Mei T (2018) Deep collaborative embedding for social image understanding. IEEE Trans Pattern Anal Mach Intell 41(9):2070–2083