Synthesizing Human Activity for Data Generation
Published: 2023-09-29
Issue: 10
Volume: 9
Page: 204
ISSN: 2313-433X
Container-title: Journal of Imaging
Language: en
Short-container-title: J. Imaging
Author:
Romero Ana 1, Carvalho Pedro 2,3, Côrte-Real Luís 1,2, Pereira Américo 1,2 (ORCID)
Affiliation:
1. Faculdade de Engenharia, Universidade do Porto, 4200-465 Porto, Portugal
2. Instituto de Engenharia de Sistemas e Computadores, Tecnologia e Ciência, 4200-465 Porto, Portugal
3. School of Engineering, Polytechnic of Porto, 4200-072 Porto, Portugal
Abstract
Gathering sufficiently representative data, such as data on human actions, shapes, and facial expressions, is costly and time-consuming, yet such data are required to train robust models. This has led to techniques such as transfer learning and data augmentation, but these are often insufficient. To address this, we propose a semi-automated mechanism for generating and editing visual scenes with synthetic humans performing various actions, with features such as background modification and manual adjustment of the 3D avatars, allowing users to create data with greater variability. We also propose a two-fold methodology for evaluating the results obtained with our method: (i) applying an action classifier to the output data produced by the mechanism and (ii) generating masks of the avatars and the actors and comparing them through segmentation. The avatars were robust to occlusion, and their actions were recognizable and faithful to their respective input actors. The results also showed that, although the action classifier concentrates on the pose and movement of the synthetic humans, it depends strongly on contextual information to recognize the actions precisely. Generating avatars for complex activities also proved problematic, both for action recognition and for producing clean, precise masks.
Funder
European Union’s Horizon Europe research and innovation programme; Fundação para a Ciência e Tecnologia
Subject
Electrical and Electronic Engineering; Computer Graphics and Computer-Aided Design; Computer Vision and Pattern Recognition; Radiology, Nuclear Medicine and Imaging
Cited by
1 article.