Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework

Published:2023-06-26 Issue:7 Volume:9 Page:130
ISSN:2313-433X
Container-title:Journal of Imaging
language:en
Short-container-title:J. Imaging

Author:

Ullah Hayat¹, Munir Arslan¹

Affiliation:

1. Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA

Abstract

Vision-based human activity recognition (HAR) has emerged as one of the essential research areas in video analytics. Over the last decade, numerous advanced deep learning algorithms have been introduced to recognize complex human actions from video streams, and they have shown impressive performance on this task. However, these methods tend to focus exclusively on either model accuracy or computational efficiency, which leaves a biased trade-off between robustness and computational efficiency when dealing with the challenging HAR problem. To improve both accuracy and computational efficiency, this paper presents a computationally efficient yet generic spatial–temporal cascaded framework that exploits deep discriminative spatial and temporal features for HAR. For efficient representation of human actions, we propose a dual attentional convolutional neural network (DA-CNN) architecture that leverages a unified channel–spatial attention mechanism to extract human-centric salient features from video frames. The dual channel–spatial attention layers, together with the convolutional layers, learn to be more selective in the spatial receptive fields of the feature maps that contain objects. The extracted discriminative salient features are then forwarded to a stacked bi-directional gated recurrent unit (Bi-GRU) for long-term temporal modeling and recognition of human actions using both forward and backward pass gradient learning. Extensive experiments are conducted on three publicly available human action datasets, and the results verify the effectiveness of the proposed framework (DA-CNN+Bi-GRU) over state-of-the-art methods in terms of model accuracy and inference runtime on each dataset. Experimental results show that the DA-CNN+Bi-GRU framework achieves up to a 167× improvement in execution speed, measured in frames per second, compared with most contemporary action-recognition methods.
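The cascade described in the abstract (per-frame channel–spatial attention, followed by a bi-directional GRU over the frame sequence) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The attention gate here is a simplified, parameter-free stand-in for the learned DA-CNN layers, the recurrent part is a single Bi-GRU layer rather than a stack, and all weights, shapes, function names, and the mean-over-time readout are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention(fmap):
    """Re-weight a (C, H, W) feature map; output has the same shape.
    Assumption: parameter-free gates stand in for learned attention layers."""
    # Channel attention: one gate per channel from global average pooling.
    ch = sigmoid(fmap.mean(axis=(1, 2)))            # (C,)
    fmap = fmap * ch[:, None, None]
    # Spatial attention: one gate per location from the channel-wise mean.
    sp = sigmoid(fmap.mean(axis=0))                 # (H, W)
    return fmap * sp[None, :, :]

def gru_pass(xs, W, U, b):
    """Single-direction GRU over xs (T, D); returns hidden states (T, Hd).
    W: (3, Hd, D), U: (3, Hd, Hd), b: (3, Hd) for update, reset, candidate."""
    h = np.zeros(U.shape[1])
    states = []
    for x in xs:
        z = sigmoid(W[0] @ x + U[0] @ h + b[0])         # update gate
        r = sigmoid(W[1] @ x + U[1] @ h + b[1])         # reset gate
        c = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])   # candidate state
        h = (1.0 - z) * h + z * c
        states.append(h)
    return np.stack(states)

def bi_gru(xs, fwd_params, bwd_params):
    """Concatenate a forward pass and a time-reversed backward pass."""
    f = gru_pass(xs, *fwd_params)
    b = gru_pass(xs[::-1], *bwd_params)[::-1]       # re-align to original order
    return np.concatenate([f, b], axis=1)           # (T, 2 * Hd)

def init_gru(rng, d, hd):
    """Random illustrative weights (hypothetical, untrained)."""
    return (rng.normal(0, 0.1, (3, hd, d)),
            rng.normal(0, 0.1, (3, hd, hd)),
            np.zeros((3, hd)))

rng = np.random.default_rng(0)
T, C, H, W_, HD, N_CLASSES = 8, 4, 6, 6, 5, 3       # hypothetical sizes
frames = rng.normal(size=(T, C, H, W_))             # stand-in for CNN feature maps

# Spatial stage: attend within each frame, then pool to one vector per frame.
feats = np.stack([channel_spatial_attention(f).mean(axis=(1, 2)) for f in frames])

# Temporal stage: Bi-GRU over the frame sequence, mean-pooled over time.
states = bi_gru(feats, init_gru(rng, C, HD), init_gru(rng, C, HD))
logits = rng.normal(0, 0.1, (N_CLASSES, 2 * HD)) @ states.mean(axis=0)
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                # softmax over action classes
print(feats.shape, states.shape, probs.shape)
```

Because both gates are sigmoid outputs in (0, 1), the attention step can only attenuate activations, which is one simple way to see how it suppresses less salient channels and locations before the temporal model sees them.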

Funder

Air Force Office of Scientific Research

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Computer Graphics and Computer-Aided Design; Computer Vision and Pattern Recognition; Radiology, Nuclear Medicine and Imaging

Link

https://www.mdpi.com/2313-433X/9/7/130/pdf

References: 114 articles.

1. Artificial Intelligence and Data Fusion at the Edge;Munir;IEEE Aerosp. Electron. Syst. Mag.,2021

2. FogSurv: A Fog-Assisted Architecture for Urban Surveillance Using Artificial Intelligence and Data Fusion;Munir;IEEE Access,2021

3. Abnormal Event Detection Using Deep Contrastive Learning for Intelligent Video Surveillance System;Huang;IEEE Trans. Ind. Inform.,2021

4. Together Recognizing, Localizing and Summarizing Actions in Egocentric Videos;Sahu;IEEE Trans. Image Process.,2021

5. Semantics-Aware Spatial–Temporal Binaries for Cross-Modal Video Retrieval;Qi;IEEE Trans. Image Process.,2021

Cited by 11 articles.

1. A bidirectional Siamese recurrent neural network for accurate gait recognition using body landmarks;Neurocomputing;2024-11

2. Recognizing human activities with the use of Convolutional Block Attention Module;Egyptian Informatics Journal;2024-09

3. FineTea: A Novel Fine-Grained Action Recognition Video Dataset for Tea Ceremony Actions;Journal of Imaging;2024-08-31

4. Human Multi-Activities Classification Using mmWave Radar: Feature Fusion in Time-Domain and PCANet;Sensors;2024-08-22

5. Deep Learning for Abnormal Human Behavior Detection in Surveillance Videos—A Survey;Electronics;2024-06-30