Bidirectional temporal feature for 3D human pose and shape estimation from a video

Published:2023-05 Issue:3-4 Volume:34
ISSN:1546-4261
Container-title:Computer Animation and Virtual Worlds
language:en
Short-container-title:Computer Animation & Virtual

Author:

Sun Libo¹, Tang Ting¹, Qu Yuke¹, Qin Wenhu¹

Affiliation:

1. School of Instrument Science and Engineering, Southeast University, Nanjing, China

Abstract

3D human pose and shape estimation is the foundation of human motion analysis. However, estimating accurate and temporally consistent 3D human motion from a video remains a challenge. To date, most video-based methods for estimating 3D human pose and shape rely on unidirectional temporal features and therefore lack comprehensive temporal context. To solve this problem, we propose a novel model, "bidirectional temporal feature for human motion recovery" (BTMR), which consists of a human motion generator and a discriminator. The transformer-based generator captures both forward and reverse temporal features to strengthen the temporal correlation between frames, and it reduces the loss of spatial information. The motion discriminator, based on Bi-LSTM, distinguishes whether the generated pose sequences are consistent with the realistic sequences of the AMASS dataset. Through this process of repeated generation and discrimination, the model learns more realistic and accurate poses. We evaluate BTMR on the 3DPW and MPI-INF-3DHP datasets. Without using the 3DPW training set, BTMR outperforms VIBE by 2.4 mm in PA-MPJPE and 14.9 mm/s² in Accel, and outperforms TCMR by 1.7 mm in PA-MPJPE on 3DPW. The results demonstrate that BTMR produces more accurate and more temporally consistent 3D human motion.
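
The abstract reports temporal consistency through an acceleration error (Accel). As a rough illustration only, and not the authors' evaluation code, an acceleration error of this kind is commonly computed as the mean distance between the second-order finite differences of the predicted and ground-truth joint positions. The function names, the input layout (a list of frames, each a list of `(x, y, z)` joints) and the `dt` parameter below are assumptions made for this sketch.

```python
import math


def acceleration(seq, dt=1.0):
    """Second-order finite difference of joint positions.

    seq: list of T frames; each frame is a list of (x, y, z) joints.
    dt:  assumed time step between frames (hypothetical parameter;
         dt=1.0 gives per-frame units).
    Returns T-2 frames of per-joint acceleration vectors.
    """
    acc = []
    for t in range(1, len(seq) - 1):
        frame = []
        for prev, cur, nxt in zip(seq[t - 1], seq[t], seq[t + 1]):
            frame.append(tuple((p - 2.0 * c + n) / (dt * dt)
                               for p, c, n in zip(prev, cur, nxt)))
        acc.append(frame)
    return acc


def accel_error(pred, gt, dt=1.0):
    """Mean Euclidean distance between predicted and ground-truth accelerations."""
    if len(pred) != len(gt) or len(pred) < 3:
        raise ValueError("need two sequences of equal length with at least 3 frames")
    total, count = 0.0, 0
    for frame_pred, frame_gt in zip(acceleration(pred, dt), acceleration(gt, dt)):
        for joint_pred, joint_gt in zip(frame_pred, frame_gt):
            total += math.dist(joint_pred, joint_gt)
            count += 1
    return total / count


# Toy example: one joint that should stay still but jitters for one frame.
gt = [[(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
pred = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
print(accel_error(pred, gt))
```

A sequence that jitters from frame to frame yields a large value even when each individual pose is close to the ground truth, which is why a metric of this kind is reported alongside per-frame position errors such as PA-MPJPE.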

Funder

National Key Research and Development Program of China

Jiangsu Provincial Key Research and Development Program

Publisher

Wiley

Subject

Computer Graphics and Computer-Aided Design,Software

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/cav.2187
