Abstract
Digital humans are increasingly used in industry, for example in the Metaverse, which has been a popular topic in recent years. Existing methods of obtaining digital human models are either expensive or lack accuracy. In this paper, we discuss a novel method to reconstruct a 3D human model from 2D images captured by a monocular camera. Our method requires as input only a set of images of the human body taken as it rotates, and it tolerates slight movement between views. First, we apply a deep learning method to predict an initial 3D human body model from the multi-view human body images. Then the complete, detailed digital human model is computed and optimized. The typical method requires the human body and the cameras to remain fixed in order to obtain a visual hull from a large number of camera images, which can be extremely expensive and inconvenient when such an application is developed for online users. Compared with a structured-light measurement system, our predict-then-optimize framework requires only several input images captured with personal equipment to deliver results with sufficient accuracy and resolution for online use.
Subject
Safety, Risk, Reliability and Quality