Abstract
Lip reading has recently become one of the most important research areas in artificial intelligence. In this study, lip reading was performed for the Turkish language using convolutional neural networks (CNNs). For this purpose, participants were asked to record videos of themselves saying numbers (61 videos), and 9 more videos were collected from YouTube. The dataset covers 20 numbers. Only the video was used; the audio was removed entirely. Because the dataset was small, it was enlarged using several methods. The model was trained on the training set and achieved a success rate of 56.25% on the test set.
Publisher
Journal of Business in The Digital Age (JOBDA)