Skeleton Action Recognition Based on Temporal Gated Unit and Adaptive Graph Convolution

Published:2022-09-19 Issue:18 Volume:11 Page:2973
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Zhu Qilin, Deng Hongmin, Wang Kaixuan

Abstract

In recent years, great progress has been made in skeleton-based action recognition using graph convolutional networks (GCNs). Most existing methods, however, extract spatial features from skeleton data with a fixed adjacency matrix and a fixed graph structure, which usually leads to weak spatial modeling ability, unsatisfactory generalization performance, and an excessive number of model parameters. In the temporal dimension, most of these methods follow the ST-GCN approach, which inevitably processes a number of non-key frames, increasing the cost of feature extraction and slowing the model down. In this paper, a gated temporally and spatially adaptive graph convolutional network is proposed. On the one hand, a learnable parameter matrix that adaptively learns the key information of the skeleton data in the spatial dimension is added to the graph convolution layer, improving the feature extraction and generalizability of the model and reducing the number of parameters. On the other hand, a gated unit is added to the temporal feature extraction module to alleviate interference from non-critical frames and reduce computational complexity. A channel attention mechanism based on an SE module and a frame attention mechanism are used to enhance the model’s feature extraction ability. To prevent model degradation and ensure more stable training, residual links are added to each feature extraction module. The proposed approach achieves 0.63% higher accuracy on the X-Sub benchmark with 4.46 M fewer parameters than GAT, one of the best SOTA methods. The inference speed of our model reaches 86.23 sequences/(second × GPU). Extensive experimental results further validate the effectiveness of the proposed approach on three large-scale datasets, namely, NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton.
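The two core ideas in the abstract, a learnable matrix added to the graph convolution and a gate in the temporal module, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the additive form `A_norm + B`, the symmetric normalization, the GLU-style sigmoid gate, and all function and parameter names below are assumptions chosen for illustration.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (a common GCN choice, assumed here)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def adaptive_graph_conv(X, A_norm, B, W):
    """Spatial graph convolution with a learnable offset B on the fixed adjacency.

    X: (T, V, C_in) frames x joints x channels
    A_norm, B: (V, V); B would be trained, here it is just an input
    W: (C_in, C_out) channel projection
    """
    A_adapt = A_norm + B  # assumed additive form of the adaptive adjacency
    return np.einsum("uv,tvc,cd->tud", A_adapt, X, W)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_temporal_unit(H, W_f, W_g):
    """GLU-style temporal convolution: features are scaled by a sigmoid gate.

    H: (T, V, C); W_f, W_g: (K, C, C) temporal kernels with odd K.
    Zero padding keeps the output at T frames.
    """
    K, T = W_f.shape[0], H.shape[0]
    pad = K // 2
    Hp = np.pad(H, ((pad, pad), (0, 0), (0, 0)))
    feat = np.zeros_like(H, dtype=float)
    gate = np.zeros_like(H, dtype=float)
    for k in range(K):
        window = Hp[k:k + T]       # frames shifted by k - pad
        feat += window @ W_f[k]
        gate += window @ W_g[k]
    return feat * sigmoid(gate)    # a gate near 0 suppresses a frame's features

# Toy example: a 3-joint chain, 4 frames, 2 input channels.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3, 2))
B = 0.01 * rng.normal(size=(3, 3))
W = rng.normal(size=(2, 2))
H = adaptive_graph_conv(X, normalize_adjacency(A), B, W)
Y = gated_temporal_unit(H, rng.normal(size=(3, 2, 2)), rng.normal(size=(3, 2, 2)))
print(Y.shape)
```

In a trained model, `B`, `W`, `W_f`, and `W_g` would be learned parameters; the sketch only shows how a gate could down-weight frames whose gate activations are small.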

Funder

Natural Science Foundation of Sichuan Province

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/11/18/2973/pdf

Cited by 7 articles.

1. Human risky behaviour recognition during ladder climbing based on multi-modal feature fusion and adaptive graph convolutional network;Signal, Image and Video Processing;2024-01-18

2. Temporal Enhancement Spatial-Temporal Graph Convolutional Networks;2023 5th International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI);2023-12-15

3. 2s-GATCN: Two-Stream Graph Attentional Convolutional Networks for Skeleton-Based Action Recognition;Electronics;2023-04-04

4. A New Partitioned Spatial–Temporal Graph Attention Convolution Network for Human Motion Recognition;Applied Sciences;2023-01-28

5. A Transformer-Based Unsupervised Domain Adaptation Method for Skeleton Behavior Recognition;IEEE Access;2023