Affiliation:
1. School of Microelectronics, Tianjin University, Tianjin 300072, China
Abstract
Accurately estimating human pose is crucial for providing feedback during exercises or musical performances, but the complex and flexible nature of human joints makes it challenging. Additionally, traditional methods often neglect pixel coordinates, which are naturally present in high-resolution images of the human body. To address this issue, we propose a novel human pose estimation method that directly incorporates pixel coordinates. Our method adds a coordinate channel to the convolution process and embeds pixel coordinates into the feature map, while also using coordinate attention to capture position- and structure-sensitive features. We further reduce the network parameters and computational cost by using small-scale convolution kernels and a smooth activation function in residual blocks. We evaluate our model on the MPII Human Pose and COCO Keypoint Detection datasets and demonstrate improved accuracy, highlighting the importance of directly incorporating coordinate location information in position-sensitive tasks.
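The idea of adding a coordinate channel can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the coordinate channels are two extra feature-map channels holding each pixel's row and column position, scaled to [-1, 1], and appended before a convolution is applied. The function names `coord_channels` and `add_coord_channels` and the nested-list feature-map layout are hypothetical and chosen only for illustration.

```python
def coord_channels(height, width):
    """Build two height x width channels holding normalized pixel coordinates.

    Assumption (not from the paper): coordinates are scaled linearly to [-1, 1].
    """
    def scale(index, size):
        # A single-pixel axis has no extent, so place it at the centre.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    y_channel = [[scale(r, height) for _ in range(width)] for r in range(height)]
    x_channel = [[scale(c, width) for c in range(width)] for _ in range(height)]
    return y_channel, x_channel


def add_coord_channels(feature_map):
    """Append y and x coordinate channels to a feature map.

    feature_map is a list of C channels, each an H x W nested list.
    Returns a new list of C + 2 channels; the input is left unchanged.
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    y_channel, x_channel = coord_channels(height, width)
    return list(feature_map) + [y_channel, x_channel]


# Usage: a 3-channel, 4 x 5 feature map gains two coordinate channels.
features = [[[0.0] * 5 for _ in range(4)] for _ in range(3)]
augmented = add_coord_channels(features)
print(len(augmented), len(augmented[0]), len(augmented[0][0]))
```

In a deep learning framework the same step would typically be a tensor concatenation along the channel axis, after which the following convolution can read position directly from its input.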
Funder
National Natural Science Foundation of China
Natural Science Foundation of Tianjin, China
Tianjin University Innovation Foundation
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering