Affiliation:
1. Max Planck Institute for Intelligent Systems, Germany
2. ETH Zürich, Switzerland and Max Planck Institute for Intelligent Systems, Germany
3. ETH Zürich, Switzerland
Abstract
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems. While 3D meshes enable efficient processing and are highly portable, they lack realism in terms of shape and appearance. Neural representations, on the other hand, are realistic but lack compatibility and are slow to train and render. Our key insight is that it is possible to efficiently learn high-fidelity 3D mesh representations via differentiable rendering by exploiting highly optimized methods from traditional computer graphics and approximating some of the components with neural networks. To that end, we introduce FLARE, a technique that enables the creation of animatable and relightable mesh avatars from a single monocular video. First, we learn a canonical geometry using a mesh representation, enabling efficient differentiable rasterization and straightforward animation via learned blendshapes and linear blend skinning weights. Second, we follow physically-based rendering and factor observed colors into intrinsic albedo, roughness, and a neural representation of the illumination, allowing the learned avatars to be relit in novel scenes. Since our input videos are captured on a single device with a narrow field of view, modeling the surrounding environment light is non-trivial. Based on the split-sum approximation for modeling specular reflections, we address this by approximating the prefiltered environment map with a multi-layer perceptron (MLP) modulated by the surface roughness, eliminating the need to explicitly model the light. We demonstrate that our mesh-based avatar formulation, combined with learned deformation, material, and lighting MLPs, produces avatars with high-quality geometry and appearance, while also being efficient to train and render compared to existing approaches.
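The animation pipeline described above, blendshape offsets followed by linear blend skinning (LBS), can be sketched as below. This is a minimal illustration of the standard technique, not the paper's implementation; the function name, array shapes, and parameter names are assumptions for the example.

```python
import numpy as np

def deform(v_canonical, blendshapes, expr_params, skin_weights, bone_transforms):
    """Deform canonical mesh vertices: expression blendshapes, then LBS.

    v_canonical:     (V, 3) canonical vertex positions
    blendshapes:     (K, V, 3) per-vertex offset basis
    expr_params:     (K,) expression coefficients
    skin_weights:    (V, J) per-vertex skinning weights (rows sum to 1)
    bone_transforms: (J, 4, 4) rigid bone transforms in homogeneous form
    """
    # Add the weighted sum of blendshape offsets to the canonical shape.
    v = v_canonical + np.einsum('k,kvc->vc', expr_params, blendshapes)
    # Lift to homogeneous coordinates for the 4x4 bone transforms.
    v_h = np.concatenate([v, np.ones((v.shape[0], 1))], axis=1)      # (V, 4)
    # Blend the bone transforms per vertex with the skinning weights.
    T = np.einsum('vj,jab->vab', skin_weights, bone_transforms)      # (V, 4, 4)
    # Apply each blended transform to its vertex and drop the w component.
    return np.einsum('vab,vb->va', T, v_h)[:, :3]
```

With zero expression coefficients and identity bone transforms, the deformation is the identity; in FLARE both the blendshapes and the skinning weights are learned rather than fixed, but the forward model is this same differentiable composition.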
Funder
Max-Planck-Gesellschaft
Max Planck ETH Center for Learning Systems
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design
Cited by
3 articles.
1. MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar. Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24, 2024-07-13.
2. Recent Trends in 3D Reconstruction of General Non‐Rigid Scenes. Computer Graphics Forum, 2024-04-30.
3. HQ3DAvatar: High-quality Implicit 3D Head Avatar. ACM Transactions on Graphics, 2024-04-09.