Author:
Patil Mahesh S, Chickerur Satyadhyan, Meti Anand, Nabapure Priyanka M, Mahindrakar Sunaina, Naik Sonali, Kanyal Soumya
Abstract
Speech communication in a noisy environment is difficult. Many professionals work in noisy settings such as aviation, construction, or manufacturing, and struggle to communicate orally. Such environments call for an automated lip-reading system that can help convey instructions and commands. This paper proposes a novel lip-reading solution that extracts the geometrical shape of lip movement from video and predicts the words or sentences spoken. A data set specific to an Indian language is developed, consisting of lip movement information captured from 50 persons: students aged 18 to 20 years and faculty aged 25 to 40 years. Each participant spoke a paragraph of 58 words in 10 sentences in Hindi (written in Devanagari, spoken in India), recorded under various conditions. The implementation combines facial parts detection with long short-term memory (LSTM) networks. The proposed solution predicts spoken words with 77% and 35% accuracy for data sets of 3 and 10 words, respectively. Sentences are predicted with 20% accuracy, which is encouraging.
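The abstract does not say which geometric features are extracted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each video frame has already been reduced to an ordered list of (x, y) lip-contour landmarks by some facial parts detector; the function names `lip_shape_features` and `sequence_features` and the chosen features (width, height, aspect ratio, enclosed area) are hypothetical choices made for illustration.

```python
def lip_shape_features(points):
    """Return [width, height, aspect_ratio, area] for one frame.

    `points` is an ordered list of (x, y) lip-contour landmarks.
    The feature set is an illustrative assumption, not taken from the paper.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)

    # Shoelace formula for the area enclosed by the ordered contour.
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    area = abs(twice_area) / 2.0

    # Guard against a degenerate contour with zero width.
    aspect_ratio = height / width if width else 0.0
    return [width, height, aspect_ratio, area]


def sequence_features(frames):
    """Map a list of per-frame landmark lists to a list of feature vectors."""
    return [lip_shape_features(frame) for frame in frames]
```

A sequence produced by `sequence_features` has one feature vector per frame, which is the kind of input a recurrent model such as an LSTM could consume to classify the spoken word.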
Publisher
Ediciones Universidad de Salamanca
Cited by
2 articles.
1. Driver Stress Detection in Simulated Driving Scenarios with Photoplethysmography;Distributed Computing and Artificial Intelligence, 19th International Conference;2022-12-13
2. Object Recognition-Driven Cultural Travel Guide for the Coffee Cultural Landscape of Colombia;Highlights in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection;2022