Abstract
Voice loss is a serious disorder that is strongly associated with social isolation. The use of multimodal information sources, such as audiovisual recordings, is crucial because it enables the development of straightforward, personalized word prediction models that can reproduce the patient’s original voice. In this work, we designed a multimodal approach based on audiovisual information recorded from patients before voice loss to develop a system for automated lip reading in the Greek language. Data pre-processing methods, such as lip segmentation and frame-level sampling, were used to enhance the quality of the imaging data. Audio information was incorporated into the model to automatically annotate sets of frames as words. Recurrent neural networks were trained on four different video recordings to develop a robust word prediction model. The model correctly identified test words across different time frames with 95% accuracy. To our knowledge, this is the first word prediction model trained to recognize words from video recordings in the Greek language.
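The word prediction pipeline described above can be illustrated with a minimal sketch: a many-to-one recurrent network that consumes a sequence of per-frame lip-region features and outputs a word class. All dimensions, parameter names, and the plain Elman-style recurrence are assumptions for illustration only; they are not taken from the paper, and a real model would be trained rather than randomly initialized.

```python
import numpy as np

# Hypothetical illustration of a many-to-one RNN word classifier.
# Feature size, hidden size, and vocabulary size are assumed values.
rng = np.random.default_rng(0)

FEAT_DIM = 32    # per-frame lip-region feature size (assumed)
HIDDEN = 16      # RNN hidden units (assumed)
NUM_WORDS = 5    # vocabulary size (assumed)

# Randomly initialized parameters; an actual system would learn these.
Wx = rng.normal(0, 0.1, (HIDDEN, FEAT_DIM))
Wh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))
b = np.zeros(HIDDEN)
Wo = rng.normal(0, 0.1, (NUM_WORDS, HIDDEN))
bo = np.zeros(NUM_WORDS)

def predict_word(frames):
    """Run the RNN over a (T, FEAT_DIM) sequence of frame features
    and return the index of the most likely word class."""
    h = np.zeros(HIDDEN)
    for x in frames:                       # one recurrent step per video frame
        h = np.tanh(Wx @ x + Wh @ h + b)   # simple Elman-style state update
    logits = Wo @ h + bo                   # classify from the final hidden state
    return int(np.argmax(logits))

# A 20-frame clip of random features stands in for a real word clip.
clip = rng.normal(size=(20, FEAT_DIM))
word_idx = predict_word(clip)
```

Because the final hidden state summarizes the whole clip, variable-length frame sequences (i.e., words spoken at different speeds) map naturally to a single class prediction.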
Funder
Hellenic Foundation for Research and Innovation, project number 579, Acronym Let's Talk
Subject
Computer Networks and Communications, Human-Computer Interaction
Cited by
2 articles.