Visual Speech Recognition by Lip Reading Using Deep Learning

Published: 2024-03-29
Page: 290-310
ISSN: 2327-3453
Container-title: Advances in Systems Analysis, Software Engineering, and High Performance Computing

Author:

Prakash V.¹, Bhavani R.¹, Karthik Durga¹, Rajalakshmi D.¹, Rajeswari N.¹, Martinaa M.¹

Affiliation:

1. SASTRA University, India

Abstract

Visual speech recognition (VSR) uses image processing techniques to extract speech or text from facial features. Like audio speech recognition systems, lip reading (LR) systems are affected by variations in facial characteristics, speaking rate, skin tone, and pronunciation. LR systems can be synchronised with an audio speech recognition system. The lip movement data, also known as lip characteristics or visemes, are obtained from an input video clip stored in the cloud; the system extracts the lip features from each frame and stores them. Training on clips with a varying number of frames, however, prevents the training dataset from yielding suitable text matches. The system has two parts: a feature extraction method that turns lip characteristics into a visual feature cube, and a Conv3D algorithm that matches words to their associated visemes. The reported word-level precision is around 89%. As a result, the 3D-CNN performs better on the MIRACL-VC1 dataset and offers higher classification accuracy than the prior system.
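The abstract does not say how the visual feature cube is built. The Python sketch below shows one plausible way to handle clips of varying length: resample every clip to a fixed number of frames before stacking, then compute the output shape of a 3D convolution over the cube. All names here (`sample_indices`, `build_feature_cube`, `conv3d_output_shape`) and all sizes are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # one grayscale lip crop, rows x cols (assumed layout)


def sample_indices(num_frames: int, target: int) -> List[int]:
    """Pick `target` evenly spaced frame indices from a clip of `num_frames`."""
    if num_frames <= 0 or target <= 0:
        raise ValueError("num_frames and target must be positive")
    if target == 1:
        return [0]
    # Linear spacing from the first to the last frame.
    # Clips shorter than `target` end up repeating some frames.
    return [round(i * (num_frames - 1) / (target - 1)) for i in range(target)]


def build_feature_cube(frames: Sequence[Frame], target: int) -> List[Frame]:
    """Stack a variable-length clip into a fixed-depth cube (target x H x W)."""
    return [frames[i] for i in sample_indices(len(frames), target)]


def conv3d_output_shape(
    shape: Tuple[int, int, int],
    kernel: Tuple[int, int, int],
    stride: Tuple[int, int, int] = (1, 1, 1),
    padding: Tuple[int, int, int] = (0, 0, 0),
) -> Tuple[int, ...]:
    """Standard convolution size formula, applied to depth, height and width."""
    return tuple(
        (s + 2 * p - k) // st + 1
        for s, k, st, p in zip(shape, kernel, stride, padding)
    )


# Hypothetical clip: 7 frames of 1x1 "pixels", resampled to a depth of 4.
clip = [[[float(t)]] for t in range(7)]
cube = build_feature_cube(clip, 4)
```

With a fixed depth, every clip produces a cube of the same shape, which is what a Conv3D layer expects as input. For a hypothetical 10 x 32 x 32 cube, `conv3d_output_shape((10, 32, 32), (3, 3, 3))` gives the size of the feature map after one unpadded 3 x 3 x 3 convolution.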

Publisher

IGI Global

Cited by 1 article.

1. Script Generation for Silent Speech in E-Learning;Advances in Educational Technologies and Instructional Design;2024-06-03