1. MesoNet: a compact facial video forgery detection network;Afchar,2018
2. Self-supervised learning of audio-visual objects from video;Afouras,2020
3. Self-supervised multimodal versatile networks;Alayrac;Neural Inf. Process. Syst.,2020
4. Speech driven video editing via an audio-conditioned diffusion model;Bigioi;Image Vis. Comput.,2024
5. Modeling consonant-vowel coarticulation for articulatory speech synthesis;Birkholz;PLoS One,2013