SCGFormer: Semantic Chebyshev Graph Convolution Transformer for 3D Human Pose Estimation

Published:2024-02-18 Issue:4 Volume:14 Page:1646
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Liang Jiayao¹, Yin Mengxiao¹,²

Affiliation:

1. School of Computer and Electronic Information, Guangxi University, Nanning 530004, China

2. Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China

Abstract

With the rapid advancement of deep learning, 3D human pose estimation has largely freed itself from reliance on methods that depend on manual annotation. Making effective use of joint features has therefore become important: predicting 3D human skeletons from 2D joint information is a central task, and leveraging 2D joint data well can improve the accuracy of the predicted 3D skeleton. In this paper, we propose the SCGFormer model to reduce the error in predicting human skeletal poses in three-dimensional space. The SCGFormer architecture combines a Transformer with two distinct types of graph convolution, organized into two interconnected modules: SGraAttention and AcChebGconv. SGraAttention extracts global feature information from each 2D human joint and augments local feature learning by integrating prior knowledge of human joint relationships. Meanwhile, AcChebGconv broadens the receptive field for graph structure information and constructs implicit joint relationships to aggregate more valuable adjacent features. SCGFormer is tested on widely recognized benchmark datasets such as Human3.6M and MPI-INF-3DHP and achieves excellent results. In particular, on Human3.6M, our method achieves the best results on 9 of the 15 actions, with an overall average error about 1.5 points lower than that of state-of-the-art methods.
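
The abstract names Chebyshev graph convolution as a building block but does not give the formulation of AcChebGconv. As a rough illustration only, the sketch below shows a generic Chebyshev polynomial graph convolution in NumPy applied to 2D joints. It is not the authors' module: the 5-joint chain skeleton, the polynomial order, the channel sizes, the random weights, and the function names `scaled_laplacian` and `cheb_graph_conv` are all assumptions made for this example.

```python
import numpy as np


def scaled_laplacian(adj):
    """Return L~ = 2L / lambda_max - I, with L the symmetric normalized Laplacian."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    mask = deg > 0
    d_inv_sqrt[mask] = deg[mask] ** -0.5
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(n)


def cheb_graph_conv(x, adj, weights):
    """Generic Chebyshev graph convolution.

    x:       (J, C_in) per-joint input features
    adj:     (J, J) symmetric adjacency matrix of the skeleton graph
    weights: (K, C_in, C_out), one weight matrix per polynomial order
    returns: (J, C_out)
    """
    l_tilde = scaled_laplacian(adj)
    num_orders = weights.shape[0]
    t_prev = x                      # T_0(L~) x = x
    out = t_prev @ weights[0]
    if num_orders > 1:
        t_curr = l_tilde @ x        # T_1(L~) x = L~ x
        out = out + t_curr @ weights[1]
        for k in range(2, num_orders):
            # Recurrence: T_k = 2 L~ T_{k-1} - T_{k-2}
            t_next = 2.0 * (l_tilde @ t_curr) - t_prev
            out = out + t_next @ weights[k]
            t_prev, t_curr = t_curr, t_next
    return out


# Hypothetical toy skeleton: 5 joints connected in a chain (not a real dataset layout).
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))           # 2D joint coordinates as input features
weights = rng.normal(size=(3, 2, 4))  # order K = 3, 2 input channels, 4 output channels
out = cheb_graph_conv(x, adj, weights)
print(out.shape)
```

With order K, each output joint mixes features from joints up to K - 1 hops away on the skeleton, which is the usual sense in which a Chebyshev filter widens the receptive field compared with a one-hop graph convolution.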

Funder

Natural Science Foundation of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2076-3417/14/4/1646/pdf
