Author:
Ding Yifan, Xu Yong, Zhang Shi-Xiong, Cong Yahuan, Wang Liqiang
Cited by
20 articles.
1. Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14
2. CAD - Contextual Multi-modal Alignment for Dynamic AVQA; 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); 2024-01-03
3. Improving Audiovisual Active Speaker Detection in Egocentric Recordings with the Data-Efficient Image Transformer; 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU); 2023-12-16
4. Uncertainty-Guided End-to-End Audio-Visual Speaker Diarization for Far-Field Recordings; Proceedings of the 31st ACM International Conference on Multimedia; 2023-10-26
5. Hyperbolic Audio-visual Zero-shot Learning; 2023 IEEE/CVF International Conference on Computer Vision (ICCV); 2023-10-01