AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model

Published: 2024 Volume: 26 Pages: 6462-6474
ISSN: 1520-9210
Container-title: IEEE Transactions on Multimedia
Short-container-title: IEEE Trans. Multimedia

Author:

Yeo Jeong Hun¹, Kim Minsu¹, Choi Jeongsoo¹, Kim Dae Hoe², Ro Yong Man¹

Affiliation:

1. Image and Video Systems Laboratory, School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea

2. Visual Intelligence Research Section, Superintelligence Creative Research Laboratory, Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea

Funder

IITP

National Research Foundation of Korea

BK21 FOUR

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Link

http://xplorestaging.ieee.org/ielx7/6046/10384483/10387745.pdf?arnumber=10387745

Reference: 91 articles.

1. 3D Convolutional Neural Networks for Human Action Recognition

2. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

3. Visual Speech Recognition in a Driver Assistance System

4. Importance-Aware Information Bottleneck Learning Paradigm for Lip Reading

Cited by 3 articles.

1. Automatic speech recognition using advanced deep learning approaches: A survey;Information Fusion;2024-09

2. Visual Speech Recognition for Languages with Limited Labeled Data Using Automatic Labels from Whisper;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

3. Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge;2023 IEEE/CVF International Conference on Computer Vision (ICCV);2023-10-01