Abstract
AbstractHuman pose estimation is an important task in computer vision, which can provide key point detection of human body and obtain bone information. At present, human pose estimation is mainly utilized for detection of large targets, and there is no solution for detection of small targets. This paper proposes a multi-channel spatial information feature based human pose (MCSF-Pose) estimation algorithm to address the issue of medium and small targets inaccurate detection of human key points in scenarios involving occlusion and multiple poses. The MCSF-Pose network is a bottom-up regression network. Firstly, an UP-Focus module is designed to expand the feature information while reducing parameter computation during the up-sampling process. Then, the channel segmentation strategy is adopted to cut the features, and the feature information of multiple dimensions is retained through different convolutional groups, which reduces the parameter lightweight network model and makes up for the loss of the feature information associated with the depth of the network. Finally, the three-layer PANet structure is designed to reduce the complexity of the model. With the aid of the structure, it also to improve the detection accuracy and anti-interference ability of human key points. The experimental results indicate that the proposed algorithm outperforms YOLO-Pose and other human pose estimation algorithms on COCO2017 and MPII human pose datasets.
Funder
Liaoning Provincial Science and Technology Plan Project
Shenyang Science and Technology Plan Project
Publisher
Springer Science and Business Media LLC