PA-Tran: Learning to Estimate 3D Hand Pose with Partial Annotation
Authors:
Yu Tianze 1, Bidulka Luke 1, McKeown Martin J. 2, Wang Z. Jane 1
Affiliations:
1. Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
2. Faculty of Medicine, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Abstract
This paper tackles a novel and challenging problem: 3D hand pose estimation (HPE) from a single RGB image with partial annotation. Most HPE methods ignore the fact that keypoints can be only partially visible (e.g., under occlusion). In contrast, we propose a deep-learning framework, PA-Tran, that jointly estimates keypoint status and 3D hand pose from a single RGB image with two dependent branches. The regression branch consists of a Transformer encoder trained to predict a set of target keypoints given an input set of status, position, and visual-feature embeddings from a convolutional neural network (CNN); the classification branch adopts a CNN to estimate keypoint status. One key idea of PA-Tran is a selective mask training (SMT) objective that uses a binary encoding scheme to represent the status of each keypoint as observed or unobserved during training. By explicitly encoding the label status (observed/unobserved), the proposed PA-Tran can efficiently handle the case where only partial annotation is available. Investigating annotation percentages ranging from 50% to 100%, we show that training with partial annotation is more efficient (e.g., achieving the best PA-MPJPE of 6.0 when using about 85% of the annotations). Moreover, we provide two new datasets: APDM-Hand, a synthetic hand dataset with APDM sensor accessories designed for a specific hand task, and PD-APDM-Hand, a real hand dataset collected from Parkinson's Disease (PD) patients with partial annotation. The proposed PA-Tran achieves higher estimation accuracy when evaluated on both proposed datasets and a more general hand dataset.
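To make the two-branch design and the selective-mask idea concrete, the following is a minimal PyTorch-style sketch based only on the abstract. The module and function names (PATranSketch, smt_loss), the keypoint count, layer sizes, token construction, and equal loss weighting are illustrative assumptions, not the authors' released implementation.

# Minimal sketch of a two-branch keypoint-status / 3D-pose model with a
# selective-mask-style training objective. All names, dimensions, and the
# loss weighting are assumptions made for illustration only.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21   # assumed standard hand keypoint count
FEAT_DIM = 256       # assumed embedding size

class PATranSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared CNN backbone producing image features (assumed architecture).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, FEAT_DIM, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.to_tokens = nn.Linear(FEAT_DIM, NUM_KEYPOINTS * FEAT_DIM)
        # Learned embeddings for keypoint index (position) and observed/unobserved status.
        self.pos_embed = nn.Embedding(NUM_KEYPOINTS, FEAT_DIM)
        self.status_embed = nn.Embedding(2, FEAT_DIM)  # 0 = unobserved, 1 = observed
        # Regression branch: Transformer encoder over per-keypoint tokens.
        layer = nn.TransformerEncoderLayer(FEAT_DIM, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.pose_head = nn.Linear(FEAT_DIM, 3)           # 3D keypoint coordinates
        # Classification branch: predicts per-keypoint status from the CNN features.
        self.status_head = nn.Linear(FEAT_DIM, NUM_KEYPOINTS)

    def forward(self, image, status):
        b = image.size(0)
        feat = self.backbone(image).flatten(1)                      # (B, FEAT_DIM)
        tokens = self.to_tokens(feat).view(b, NUM_KEYPOINTS, -1)    # (B, K, FEAT_DIM)
        idx = torch.arange(NUM_KEYPOINTS, device=image.device)
        tokens = tokens + self.pos_embed(idx) + self.status_embed(status)
        pose3d = self.pose_head(self.encoder(tokens))               # (B, K, 3)
        status_logits = self.status_head(feat)                      # (B, K)
        return pose3d, status_logits

def smt_loss(pose3d, status_logits, gt_pose, gt_status):
    """Selective-mask-style objective (sketch): regress only observed (annotated)
    keypoints, classify observed/unobserved status for all keypoints."""
    mask = gt_status.float().unsqueeze(-1)                          # (B, K, 1)
    reg = ((pose3d - gt_pose).abs() * mask).sum() / mask.sum().clamp(min=1)
    cls = nn.functional.binary_cross_entropy_with_logits(status_logits, gt_status.float())
    return reg + cls

In this sketch, the binary status vector is used twice, mirroring the abstract: it conditions the regression branch via a status embedding added to each keypoint token, and it masks the regression loss so that unannotated keypoints contribute only to the status-classification term.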
Funder
Natural Sciences and Engineering Research Council of Canada; Canadian Institutes of Health Research
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
References: 57 articles.
Cited by
1 article.