Abstract
Digital humans are increasingly used in industry, for example in the Metaverse, which has been a popular topic in recent years. Existing methods of obtaining digital human models are either expensive or lack accuracy. In this paper, we discuss a novel method to reconstruct a 3D human model from 2D images captured by a monocular camera. Our method requires as input only a set of images of a rotating human body, and it tolerates slight movement between views. First, we apply a deep learning method to predict an initial 3D human body model from the multi-view human body images. Then the full detailed digital human model is computed and optimized. The typical method requires the human body and the cameras to be fixed in order to obtain a visual hull from a significant number of camera images. This can be extremely expensive and inconvenient when such an application is developed for online users. Compared with a structured-light measurement system, our predict-and-optimize framework requires only several input images captured by personal equipment, yet provides sufficient accuracy and a resolution adequate for online use.