FLARE: Fast Learning of Animatable and Relightable Mesh Avatars

Authors:

Shrisha Bharadwaj (1), Yufeng Zheng (2), Otmar Hilliges (3), Michael J. Black (1), Victoria Fernandez Abrevaya (1)

Affiliation:

1. Max Planck Institute for Intelligent Systems, Germany

2. ETH Zürich, Switzerland and Max Planck Institute for Intelligent Systems, Germany

3. ETH Zürich, Switzerland

Abstract

Our goal is to efficiently learn personalized animatable 3D head avatars from videos that are geometrically accurate, realistic, relightable, and compatible with current rendering systems. While 3D meshes enable efficient processing and are highly portable, they lack realism in terms of shape and appearance. Neural representations, on the other hand, are realistic but lack compatibility and are slow to train and render. Our key insight is that it is possible to efficiently learn high-fidelity 3D mesh representations via differentiable rendering by exploiting highly-optimized methods from traditional computer graphics and approximating some of the components with neural networks. To that end, we introduce FLARE, a technique that enables the creation of animatable and relightable mesh avatars from a single monocular video. First, we learn a canonical geometry using a mesh representation, enabling efficient differentiable rasterization and straightforward animation via learned blendshapes and linear blend skinning weights. Second, we follow physically-based rendering and factor observed colors into intrinsic albedo, roughness, and a neural representation of the illumination, allowing the learned avatars to be relit in novel scenes. Since our input videos are captured on a single device with a narrow field of view, modeling the surrounding environment light is non-trivial. Based on the split-sum approximation for modeling specular reflections, we address this by approximating the prefiltered environment map with a multi-layer perceptron (MLP) modulated by the surface roughness, eliminating the need to explicitly model the light. We demonstrate that our mesh-based avatar formulation, combined with learned deformation, material, and lighting MLPs, produces avatars with high-quality geometry and appearance, while also being efficient to train and render compared to existing approaches.
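To make the split-sum idea in the abstract concrete, below is a minimal PyTorch-style sketch of a roughness-modulated environment MLP replacing the prefiltered environment map. This is an illustrative assumption, not the authors' implementation: the module name `PrefilteredEnvMLP`, the layer sizes, and the `env_brdf_placeholder` helper (which stands in for the usual pre-integrated BRDF lookup table) are all hypothetical.

```python
import torch
import torch.nn as nn


class PrefilteredEnvMLP(nn.Module):
    """Hypothetical stand-in for the prefiltered environment map in the
    split-sum approximation: maps a reflected view direction and a surface
    roughness value to prefiltered incoming radiance (RGB)."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Softplus(),  # radiance is non-negative
        )

    def forward(self, refl_dir: torch.Tensor, roughness: torch.Tensor) -> torch.Tensor:
        # refl_dir: (N, 3) unit reflection vectors; roughness: (N, 1)
        return self.net(torch.cat([refl_dir, roughness], dim=-1))


def env_brdf_placeholder(n_dot_v: torch.Tensor, roughness: torch.Tensor):
    """Crude stand-in for the pre-integrated BRDF term of the split sum
    (normally a 2D lookup table indexed by (n.v, roughness))."""
    scale = (1.0 - roughness) * n_dot_v
    bias = 0.04 * (1.0 - n_dot_v)
    return scale, bias


def specular_split_sum(normals, view_dirs, roughness, env_mlp):
    """Specular shading as (prefiltered radiance) x (pre-integrated BRDF)."""
    n_dot_v = (normals * view_dirs).sum(-1, keepdim=True).clamp(min=1e-4)
    refl = 2.0 * n_dot_v * normals - view_dirs            # reflect view about normal
    prefiltered = env_mlp(refl, roughness)                 # (N, 3), queried from the MLP
    scale, bias = env_brdf_placeholder(n_dot_v, roughness)
    f0 = 0.04                                              # dielectric base reflectance
    return prefiltered * (f0 * scale + bias)


if __name__ == "__main__":
    # Toy usage: shade 1024 surface points with random normals and view directions.
    n = nn.functional.normalize(torch.randn(1024, 3), dim=-1)
    v = nn.functional.normalize(torch.randn(1024, 3), dim=-1)
    r = torch.rand(1024, 1)
    spec = specular_split_sum(n, v, r, PrefilteredEnvMLP())
    print(spec.shape)  # torch.Size([1024, 3])
```

In the full method this specular term would be combined with a diffuse albedo component, with the whole shading path kept differentiable so that the mesh geometry, the material MLP, and the illumination MLP can be optimized jointly from the input video.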

Funder

Max-Planck-Gesellschaft

Max Planck ETH Center for Learning Systems

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Graphics and Computer-Aided Design


Cited by 3 articles.

1. MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar; ACM SIGGRAPH 2024 Conference Papers; 2024-07-13

2. Recent Trends in 3D Reconstruction of General Non-Rigid Scenes; Computer Graphics Forum; 2024-04-30

3. HQ3DAvatar: High-quality Implicit 3D Head Avatar; ACM Transactions on Graphics; 2024-04-09
