1. Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
2. Mónica Villanueva Aylagas, Héctor Anadon Leon, Mattias Teye, and Konrad Tollmar. 2022. Voice2Face: Audio-driven Facial and Tongue Rig Animations with cVAEs. In Eurographics Symposium on Computer Animation (SCA 2022).
3. Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. 2020. wav2vec 2.0: A framework for self-supervised learning of speech representations. Advances in Neural Information Processing Systems 33 (2020), 12449–12460.
4. Dan Bigioi, Shubhajit Basak, Hugh Jordan, Rachel McDonnell, and Peter Corcoran. 2023. Speech Driven Video Editing via an Audio-Conditioned Diffusion Model. arXiv preprint arXiv:2301.04474 (2023).
5. Expressive speech-driven facial animation