Enhanced 3D Pose Estimation in Multi-Person, Multi-View Scenarios through Unsupervised Domain Adaptation with Dropout Discriminator
Author:
Deng Junli1, Yao Haoyuan1, Shi Ping1
Affiliation:
1. School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
Abstract
Data-driven pose estimation methods often assume equal distributions between training and test data. However, in reality, this assumption does not always hold true, leading to significant performance degradation due to distribution mismatches. In this study, our objective is to enhance the cross-domain robustness of multi-view, multi-person 3D pose estimation. We tackle the domain shift challenge through three key approaches: (1) A domain adaptation component is introduced to improve estimation accuracy for specific target domains. (2) By incorporating a dropout mechanism, we train a more reliable model tailored to the target domain. (3) Transferable Parameter Learning is employed to retain crucial parameters for learning domain-invariant data. The foundation for these approaches lies in the H-divergence theory and the lottery ticket hypothesis, which are realized through adversarial training by learning domain classifiers. Our proposed methodology is evaluated using three datasets: Panoptic, Shelf, and Campus, allowing us to assess its efficacy in addressing domain shifts in multi-view, multi-person pose estimation. Both qualitative and quantitative experiments demonstrate that our algorithm performs well in two different domain shift scenarios.
Subject
Electrical and Electronic Engineering,Biochemistry,Instrumentation,Atomic and Molecular Physics, and Optics,Analytical Chemistry
Reference66 articles.
1. Human pose estimation and its application to action recognition: A survey;Song;J. Vis. Commun. Image Represent.,2021 2. Driving-signal aware full-body avatars;Bagautdinov;ACM Trans. Graph.,2021 3. Wang, J., Yan, S., Dai, B., and Lin, D. (2021, January 20–25). Scene-aware generative network for human motion synthesis. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA. 4. Moon, G., Chang, J.Y., and Lee, K.M. (November, January 27). Camera distance-aware top-down approach for 3d multi-person pose estimation from a single rgb image. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea. 5. Zeng, A., Ju, X., Yang, L., Gao, R., Zhu, X., Dai, B., and Xu, Q. (2022). Deciwatch: A simple baseline for 10x efficient 2d and 3d pose estimation. arXiv.
|
|