Author:
Michelsanti Daniel, Slizovskaia Olga, Haro Gloria, Gómez Emilia, Tan Zheng-Hua, Jensen Jesper
Cited by
22 articles.
1. Self-Supervised Adaptive AV Fusion Module for Pre-Trained ASR Models;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14
2. Large-Scale Unsupervised Audio Pre-Training for Video-to-Speech Synthesis;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024
3. RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations;2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC);2023-10-31
4. Facetron: A Multi-Speaker Face-to-Speech Model Based on Cross-Modal Latent Representations;2023 31st European Signal Processing Conference (EUSIPCO);2023-09-04
5. Analyzing lower half facial gestures for lip reading applications: Survey on vision techniques;Computer Vision and Image Understanding;2023-08