SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces

Published: 2023-10-26
Container-title: Proceedings of the 31st ACM International Conference on Multimedia

Author:

Peng Ziqiao¹, Luo Yihao², Shi Yue³, Xu Hao⁴, Zhu Xiangyu⁵, Liu Hongyan⁶, He Jun¹, Fan Zhaoxin¹

Affiliation:

1. Renmin University of China, Beijing, China

2. Imperial College London, London, United Kingdom

3. Psyche AI Inc., Beijing, China

4. HKUST, Hong Kong, Hong Kong

5. Chinese Academy of Sciences, Beijing, China

6. Tsinghua University, Beijing, China

Funder

National Key Research and Development Program of China

Public Computing Cloud Renmin University of China

National Natural Science Foundation of China

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3581783.3611734

Reference: 66 articles.

1. Triantafyllos Afouras, Joon Son Chung, and Andrew Zisserman. 2018. LRS3-TED: a large-scale dataset for visual speech recognition. arXiv preprint arXiv:1809.00496 (2018).

2. Yannis M Assael, Brendan Shillingford, Shimon Whiteson, and Nando De Freitas. 2016. Lipnet: End-to-end sentence-level lipreading. arXiv preprint arXiv:1611.01599 (2016).

3. Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. 2020. wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in neural information processing systems 33 (2020), 12449-12460.

4. Linchao Bao, Haoxian Zhang, Yue Qian, Tangli Xue, Changhai Chen, Xuefei Zhe, and Di Kang. 2023. Learning Audio-Driven Viseme Dynamics for 3D Face Animation. arXiv preprint arXiv:2301.06059 (2023).

5. Authentic volumetric avatars from a phone scan;Cao Chen;ACM Transactions on Graphics (TOG),2022

Cited by 4 articles.

1. Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance;Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers '24;2024-07-13

2. CLTalk: Speech-Driven 3D Facial Animation with Contrastive Learning;Proceedings of the 2024 International Conference on Multimedia Retrieval;2024-05-30

3. NERF-AD: Neural Radiance Field With Attention-Based Disentanglement For Talking Face Synthesis;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

4. Self-Diffuser: Research on the Technology of Speech-Driven Facial Expressions;Computer Science and Application;2024