Affiliation:
1. Max Planck Institute for Intelligent Systems, Germany
2. ETH Zürich, Switzerland and Max Planck Institute for Intelligent Systems, Germany
3. ETH Zürich, Switzerland
Abstract
Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems. While 3D meshes enable efficient processing and are highly portable, they lack realism in terms of shape and appearance. Neural representations, on the other hand, are realistic but lack compatibility and are slow to train and render. Our key insight is that it is possible to efficiently learn high-fidelity 3D mesh representations via differentiable rendering by exploiting highly optimized methods from traditional computer graphics and approximating some of the components with neural networks. To that end, we introduce FLARE, a technique that enables the creation of animatable and relightable mesh avatars from a single monocular video. First, we learn a canonical geometry using a mesh representation, enabling efficient differentiable rasterization and straightforward animation via learned blendshapes and linear blend skinning weights. Second, we follow physically-based rendering and factor observed colors into intrinsic albedo, roughness, and a neural representation of the illumination, allowing the learned avatars to be relit in novel scenes. Since our input videos are captured on a single device with a narrow field of view, modeling the surrounding environment light is non-trivial. Based on the split-sum approximation for modeling specular reflections, we address this by approximating the prefiltered environment map with a multi-layer perceptron (MLP) modulated by the surface roughness, eliminating the need to explicitly model the light. We demonstrate that our mesh-based avatar formulation, combined with learned deformation, material, and lighting MLPs, produces avatars with high-quality geometry and appearance, while also being efficient to train and render compared to existing approaches.
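The animation pipeline described above, blendshape offsets followed by linear blend skinning (LBS), can be sketched as below. This is a minimal illustration of the standard technique, not the paper's implementation; the function name, array shapes, and parameter names are assumptions for the example.

```python
import numpy as np

def deform(v_canonical, blendshapes, expr_params, skin_weights, bone_transforms):
    """Deform canonical mesh vertices: expression blendshapes, then LBS.

    v_canonical:     (V, 3) canonical vertex positions
    blendshapes:     (K, V, 3) per-vertex offset basis
    expr_params:     (K,) expression coefficients
    skin_weights:    (V, J) per-vertex skinning weights (rows sum to 1)
    bone_transforms: (J, 4, 4) rigid bone transforms in homogeneous form
    """
    # Add the weighted sum of blendshape offsets to the canonical shape.
    v = v_canonical + np.einsum('k,kvc->vc', expr_params, blendshapes)
    # Lift to homogeneous coordinates for the 4x4 bone transforms.
    v_h = np.concatenate([v, np.ones((v.shape[0], 1))], axis=1)      # (V, 4)
    # Blend the bone transforms per vertex with the skinning weights.
    T = np.einsum('vj,jab->vab', skin_weights, bone_transforms)      # (V, 4, 4)
    # Apply each blended transform to its vertex and drop the w component.
    return np.einsum('vab,vb->va', T, v_h)[:, :3]
```

With zero expression coefficients and identity bone transforms, the deformation is the identity; in FLARE both the blendshapes and the skinning weights are learned rather than fixed, but the forward model is this same differentiable composition.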
Funder
Max-Planck-Gesellschaft
Max Planck ETH Center for Learning Systems
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design
Cited by
3 articles.
1. MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar. Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24, 2024-07-13.
2. Recent Trends in 3D Reconstruction of General Non‐Rigid Scenes. Computer Graphics Forum, 2024-04-30.
3. HQ3DAvatar: High-quality Implicit 3D Head Avatar. ACM Transactions on Graphics, 2024-04-09.