Human action recognition with transformer based on convolutional features

Published: 2024-06-07 Issue: 2 Volume: 18 Pages: 881-896
ISSN: 1872-4981
Container-title: Intelligent Decision Technologies
Short-container-title: IDT

Author:

Shi Chengcheng, Liu Shuxin

Abstract

Human action recognition is a key research direction in computer vision, with practical applications in video surveillance, human-computer interaction, sports analysis, and healthcare. However, the diversity and complexity of human actions pose many challenges, such as handling complex actions, distinguishing similar actions, coping with changes in viewing angle, and overcoming occlusion. To address these challenges, this paper proposes a framework for human action recognition that combines a recent pose estimation algorithm, a pre-trained CNN model, and a Vision Transformer. First, the pose estimation algorithm extracts human pose information from real RGB image frames. Then, a pre-trained CNN model extracts features from the extracted pose information. Finally, the Vision Transformer fuses and classifies the extracted features. The framework is validated experimentally on two benchmark datasets, UCF50 and UCF101, to demonstrate its effectiveness and efficiency. Quantitative and qualitative experiments further explore the applicability and limitations of the framework in different scenarios, providing insights for future research.
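The final stage described in the abstract, fusing per-frame features with a Transformer and classifying the clip, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the single attention head, the class token, the residual connection, and all names (`classify_clip`, `self_attention`, `random_matrix`) are assumptions made for illustration. The weights are random and untrained, positional encodings are omitted, and the pose estimation and CNN stages are replaced by placeholder feature vectors.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(v):
    """Numerically stable softmax over a list of floats."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rng, rows, cols, scale=0.1):
    """Small random weights standing in for trained parameters (assumption)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]          # transpose of k
    scores = matmul(q, k_t)                        # pairwise token similarities
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)                      # weighted mix of value vectors

def classify_clip(frame_features, num_classes, seed=0):
    """Fuse per-frame features with one attention layer and classify the clip."""
    rng = random.Random(seed)
    dim = len(frame_features[0])
    # Hypothetical class token: its attended output summarizes all frames.
    cls_token = [rng.gauss(0.0, 0.1) for _ in range(dim)]
    tokens = [cls_token] + frame_features
    w_q, w_k, w_v = (random_matrix(rng, dim, dim) for _ in range(3))
    attended = self_attention(tokens, w_q, w_k, w_v)
    # Residual connection on the class token only, then a linear head.
    fused = [t + a for t, a in zip(tokens[0], attended[0])]
    w_out = random_matrix(rng, dim, num_classes)
    logits = matmul([fused], w_out)[0]
    return softmax(logits)

# Placeholder input: 4 frames with 8-dimensional features (stand-ins for CNN outputs).
rng = random.Random(1)
features = [[rng.random() for _ in range(8)] for _ in range(4)]
probs = classify_clip(features, num_classes=3)
print(len(probs), round(sum(probs), 6))
```

In a real system the placeholder vectors would come from a pre-trained CNN applied to the pose estimator's output, and the attention layer would be a trained multi-layer Vision Transformer rather than one random head.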

Publisher

IOS Press

References: 21 articles.

1. Shedding light on people action recognition in social robotics by means of common spatial patterns; Rodríguez-Moreno; Sensors, 2020

2. Suspicious activity detection using deep learning in secure assisted living IoT environments; Vallathan; The Journal of Supercomputing, 2021

3. The security of vulnerable senior citizens through dynamically sensed signal acquisition; Wang; Transactions on Emerging Telecommunications Technologies, 2022

4. Driving behavior explanation with multi-level fusion; Ben-Younes; Pattern Recognition, 2022

5. Part-aligned pose-guided recurrent network for action recognition; Huang; Pattern Recognition, 2019

Cited by 1 article.

1. Advancements in Real-Time Human Activity Recognition via Innovative Fusion of 3DCNN and ConvLSTM Models; Journal of Machine and Computing; 2024-07-05