Gesture Generation with Diffusion Models Aided by Speech Activity Information

Published: 2023-10-09
Container-title: International Conference on Multimodal Interaction

Author:

Tonoli Rodolfo L.¹, Marques Leonardo B. de M. M.², Ueda Lucas H.², Costa Paula Dornhofer Paro¹

Affiliation:

1. Dept. of Computer Engineering and Automation, School of Electrical and Computer Engineering, University of Campinas (UNICAMP), Brazil and Artificial Intelligence Lab., Recod.ai, University of Campinas (UNICAMP), Brazil

2. CPQD, Brazil

Funder

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES)

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3610661.3616554

References: 38 articles.

1. Simon Alexanderson, Rajmund Nagy, Jonas Beskow, and Gustav Eje Henter. 2022. Listen, denoise, action! audio-driven motion synthesis with diffusion models. arXiv preprint arXiv:2211.09707 (2022).

2. Tenglong Ao, Zeyi Zhang, and Libin Liu. 2023. GestureDiffuCLIP: Gesture diffusion model with CLIP latents. arXiv preprint arXiv:2303.14613 (2023).

3. Rohan Badlani, Adrian Łańcucki, Kevin J Shih, Rafael Valle, Wei Ping, and Bryan Catanzaro. 2022. One TTS alignment to rule them all. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 6092–6096.

4. Multimodal Machine Learning: A Survey and Taxonomy

5. The IVI Lab entry to the GENEA Challenge 2022 – A Tacotron2 Based Method for Co-Speech Gesture Generation With Locality-Constraint Attention Mechanism