Isolated single sound lip-reading using a frame-based camera and event-based camera

Published: 2023-01-11 Volume: 5
ISSN: 2624-8212
Container-title: Frontiers in Artificial Intelligence
Short-container-title: Front. Artif. Intell.

Author:

Kanamaru Tatsuya, Arakane Taiki, Saitoh Takeshi

Abstract

Unlike the conventional frame-based camera, the event-based camera detects changes in the brightness value of each pixel over time. This research explores lip-reading as a new application of the event-based camera. This paper proposes an event camera-based lip-reading method for isolated single sound recognition. The proposed method consists of imaging from event data, face and facial feature point detection, and recognition using a Temporal Convolutional Network. Furthermore, this paper proposes a method that combines the two modalities of a frame-based camera and an event-based camera. To evaluate the proposed method, utterance scenes of 15 Japanese consonants from 20 speakers were collected using an event-based camera and a video camera, and an original dataset was constructed. Several experiments were conducted by generating images at multiple frame rates from the event-based camera data. As a result, the highest recognition accuracy was obtained with the event-based camera images at 60 fps. Moreover, it was confirmed that combining the two modalities yields higher recognition accuracy than a single modality.
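
The "imaging from event data" step at a chosen frame rate can be illustrated with a minimal sketch. The abstract does not specify how the authors render events into images, so the approach below (summing signed event polarities per pixel within fixed-length time bins) is an assumption, and the function name `events_to_frames` and the `(t, x, y, polarity)` event layout are hypothetical.

```python
import numpy as np


def events_to_frames(events, height, width, fps, duration_s):
    """Accumulate events into fixed-rate frames (assumed imaging scheme).

    events: sequence of (t, x, y, polarity) rows, with t in seconds and
            polarity +1 (brightness increase) or -1 (brightness decrease).
    Returns an int32 array of shape (num_frames, height, width) holding
    the signed event count per pixel in each time bin of length 1/fps.
    """
    events = np.asarray(events, dtype=np.float64).reshape(-1, 4)
    num_frames = int(np.ceil(duration_s * fps))
    frames = np.zeros((num_frames, height, width), dtype=np.int32)

    t, x, y, p = events.T
    frame_idx = np.floor(t * fps).astype(np.int64)

    # Discard events that fall outside the requested time window.
    keep = (frame_idx >= 0) & (frame_idx < num_frames)

    # np.add.at accumulates correctly when several events hit the same pixel.
    np.add.at(
        frames,
        (frame_idx[keep], y[keep].astype(np.int64), x[keep].astype(np.int64)),
        p[keep].astype(np.int32),
    )
    return frames


# Three synthetic events rendered at two different frame rates.
events = [(0.001, 1, 2, 1), (0.002, 1, 2, 1), (0.020, 3, 0, -1)]
frames_60 = events_to_frames(events, height=4, width=5, fps=60, duration_s=1.0)
frames_30 = events_to_frames(events, height=4, width=5, fps=30, duration_s=1.0)
```

Because the same event stream can be binned at any rate, one recording yields image sequences at several frame rates, which is the kind of comparison the abstract reports.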

Publisher

Frontiers Media SA

Subject

Artificial Intelligence

References (23 articles)

1. Deep lip reading: a comparison of models and an online application;Afouras;Interspeech 2018,2018

2. LipNet: end-to-end sentence-level lipreading;Assael;arXiv:1611.01599,2016

3. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling;Bai;arXiv preprint,2018

4. Lip reading sentences in the wild;Chung;IEEE Conference on Computer Vision and Pattern Recognition (CVPR),2017

5. Lip reading in the wild;Chung;Asian Conference on Computer Vision (ACCV),2016

Cited by 5 articles.

1. Recent Advances in Bio-Inspired Vision Sensor: A Review;Journal of Circuits, Systems and Computers;2024-07-10

2. KuchiNavi: lip-reading-based navigation app;Fifteenth International Conference on Graphics and Image Processing (ICGIP 2023);2024-03-25

3. Faces in Event Streams (FES): An Annotated Face Dataset for Event Cameras;Sensors;2024-02-22

4. Can you read lips with a masked face?;2023 18th International Conference on Machine Vision and Applications (MVA);2023-07-23

5. Efficient DNN Model for Word Lip-Reading;Algorithms;2023-05-27