Abstract
In this work, we tackle the problem of estimating 3D human pose in camera space from a monocular image. First, we propose to use densely generated limb depth maps, which are well aligned with image cues, to ease the learning of body-joint depths. Then, we design a lifting module from 2D pixel coordinates to 3D camera coordinates that explicitly takes the estimated depth values as input and is consistent with the camera perspective projection model. We show that our method achieves superior performance on the large-scale 3D pose datasets Human3.6M and MPI-INF-3DHP, setting a new state of the art.
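The lifting step described above follows the standard pinhole perspective projection model, in which a pixel location and its depth determine the 3D point in camera space. Below is a minimal sketch of such a back-projection; the function name, intrinsic values, and joint coordinates are illustrative assumptions, not the paper's actual module.

import numpy as np

def backproject_to_camera(uv, depth, fx, fy, cx, cy):
    # Lift 2D pixel coordinates with per-joint depth to 3D camera coordinates
    # using the pinhole perspective model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    u, v = uv[..., 0], uv[..., 1]
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return np.stack([X, Y, depth], axis=-1)

# Hypothetical example: lift two joints given camera intrinsics.
uv = np.array([[512.0, 480.0], [530.0, 300.0]])   # pixel coordinates of joints
depth = np.array([4.2, 4.5])                       # estimated joint depths in metres
joints_cam = backproject_to_camera(uv, depth, fx=1145.0, fy=1144.0, cx=512.5, cy=515.0)
print(joints_cam)                                  # (N, 3) array of X, Y, Z in camera space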
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
16 articles.
1. HF-HRNet: A Simple Hardware Friendly High-Resolution Network. IEEE Transactions on Circuits and Systems for Video Technology, 2024-08.
2. Updating Depth-aware Feature in the Feedback Loop for Human Mesh Recovery. 2024 International Joint Conference on Neural Networks (IJCNN), 2024-06-30.
3. 3D pose estimation using joint-based calibration in distributed RGB-D camera system. Computers & Graphics, 2024-05.
4. Video 3D Human Pose Estimation Guided by Action Category Feature. 2024 5th International Conference on Computer Vision, Image and Deep Learning (CVIDL), 2024-04-19.
5. L-HRNet: A Lightweight High-Resolution Network for Human Pose Estimation. 2023 8th International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2023-11-23.