Author:
Chen Hang, Du Jun, Dai Yusheng, Lee Chin-Hui, Siniscalchi Sabato Marco, Watanabe Shinji, Scharenborg Odette, Chen Jingdong, Yin Baocai, Pan Jia
Cited by
17 articles.
1. Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder; 2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW); 2024-07-15
2. Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14
3. The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14
4. OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14
5. MLCA-AVSR: Multi-Layer Cross Attention Fusion Based Audio-Visual Speech Recognition; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14