3D CNN hand pose estimation with end-to-end hierarchical model and physical constraints from depth images
-
Published:2023
Issue:1
Volume:33
Page:35-48
-
ISSN:2336-4335
-
Container-title:Neural Network World
-
language:
-
Short-container-title:NNW
Author:
Xu Zhengze, Zhang Wenjun
Abstract
Previous studies mainly treat the depth image as a flat image, so depth data tends to be mapped to gray values during convolution and feature extraction. To address this issue, a 3D CNN approach to hand pose estimation with an end-to-end hierarchical model and physical constraints is proposed. After the 3D spatial structure of the hand is reconstructed from the depth image, the 3D model is converted into a voxel grid for hand pose estimation by a 3D CNN. The method improves on the 3D CNN by embedding an end-to-end hierarchical model and a constraint algorithm into the network, which gives fast convergence during training and avoids unrealistic hand poses. In the experiments, it reaches a mean accuracy of 87.98% and a mean absolute error (MAE) of 8.82 mm over all 21 joints, within 24 ms at inference time, consistently outperforming several well-known gesture recognition algorithms.
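The depth-to-voxel step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pinhole back-projection, the camera intrinsics (`fx`, `fy`, `cx`, `cy`), the 32-voxel grid, the 300 mm cube, the centroid-based centering, and the function names `depth_to_points` and `voxelize` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def depth_to_points(depth_mm, fx, fy, cx, cy):
    """Back-project a depth image (mm) to an (N, 3) point cloud.

    Assumes a pinhole camera model; the intrinsics are hypothetical.
    """
    h, w = depth_mm.shape
    v, u = np.indices((h, w))      # v: row index, u: column index
    z = depth_mm.astype(np.float64)
    valid = z > 0                  # assumption: zero depth marks missing pixels
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=1)

def voxelize(points, grid_size=32, cube_mm=300.0):
    """Binary occupancy grid of a cube centred on the point-cloud centroid.

    grid_size and cube_mm are illustrative values, not taken from the paper.
    """
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    if len(points) == 0:
        return grid
    center = points.mean(axis=0)
    # Map coordinates in [-cube/2, cube/2) to voxel indices in [0, grid_size).
    idx = np.floor((points - center) / cube_mm * grid_size + grid_size / 2).astype(int)
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    idx = idx[inside]
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

# Synthetic example: a flat 16x16 pixel patch at 500 mm depth.
depth = np.zeros((64, 64), dtype=np.float32)
depth[24:40, 24:40] = 500.0
points = depth_to_points(depth, fx=240.0, fy=240.0, cx=32.0, cy=32.0)
voxels = voxelize(points)
print(points.shape, voxels.shape)
```

The resulting occupancy grid is the kind of volumetric input a 3D CNN would consume in place of a gray-value image, which keeps depth as a spatial dimension rather than an intensity.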
Publisher
Czech Technical University in Prague - Central Library
Subject
Artificial Intelligence, Hardware and Architecture, General Neuroscience, Software