Channel Attention-Based Approach with Autoencoder Network for Human Action Recognition in Low-Resolution Frames

Published: 2024-01-04 Volume: 2024 Page: 1-22
ISSN: 1098-111X
Container-title: International Journal of Intelligent Systems
Language: en
Short-container-title: International Journal of Intelligent Systems

Author:

Dastbaravardeh Elaheh¹, Askarpour Somayeh², Saberi Anari Maryam², Rezaee Khosro³

Affiliation:

1. Department of Control Engineering, Islamic Azad University of Mashhad, Mashhad, Iran

2. Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran

3. Department of Biomedical Engineering, Meybod University, Meybod, Iran

Abstract

Action recognition (AR) has many applications, including surveillance, health and disability care, human-machine interaction, video-content-based monitoring, and activity recognition. Because human action videos contain a large number of frames, implemented models must minimize computation by reducing the number, size, and resolution of frames. We propose an improved method for detecting human actions in small-size, low-resolution videos by employing convolutional neural networks (CNNs) with channel attention mechanisms (CAMs) and autoencoders (AEs). By enhancing blocks with more representative features, the convolutional layers extract discriminative features from various networks. Additionally, we randomly sample frames before the main processing to improve accuracy while using less data. The goal is to increase performance while overcoming challenges such as overfitting, computational complexity, and uncertainty by utilizing CNN-CAM and AE. The next step is to identify the patterns and features associated with selective high-level performance. To validate the method, small-size, low-resolution video frames from the UCF50, UCF101, and HMDB51 datasets were used. The algorithm also has relatively low computational complexity. Consequently, the proposed method performs satisfactorily compared to other similar methods: it achieves accuracies of 77.29%, 98.87%, and 97.16% on the HMDB51, UCF50, and UCF101 datasets, respectively. These results indicate that the method can effectively classify human actions. Furthermore, the proposed method can be used as a processing model for small-size, low-resolution video frames.
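
The abstract names two ingredients that are easy to illustrate in isolation: random sampling of frames (with size reduction) before the main processing, and channel attention over convolutional feature maps. The following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the form of the CAM, so a squeeze-and-excitation-style gate is assumed here; the function names, the strided downscaling, and the weight shapes are all hypothetical, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
import math
import random


def sample_frame_indices(num_frames, num_samples, seed=None):
    """Pick a random subset of frame indices, returned in temporal order."""
    if num_samples >= num_frames:
        return list(range(num_frames))
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_frames), num_samples))


def downscale(frame, factor):
    """Reduce a 2D frame by keeping every `factor`-th row and column.

    Assumption: simple strided subsampling; the paper may use another method.
    """
    return [row[::factor] for row in frame[::factor]]


def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation-style channel gate (assumed form of the CAM).

    feature_map: list of C channels, each an H x W nested list of floats.
    w1: hidden x C weight matrix; w2: C x hidden weight matrix (no biases).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates
```

In such a pipeline, `sample_frame_indices` would select which frames of a clip to keep, `downscale` would shrink each kept frame, and `channel_attention` would reweight the channels of a feature map produced by a convolutional layer, so that more informative channels contribute more to later layers. In a trained network the weights `w1` and `w2` would be learned rather than supplied by hand.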

Publisher

Hindawi Limited

Subject

Artificial Intelligence, Human-Computer Interaction, Theoretical Computer Science, Software

Link

http://downloads.hindawi.com/journals/ijis/2024/1052344.pdf
