Two-Stream Mixed Convolutional Neural Network for American Sign Language Recognition

Published:2022-08-09 Issue:16 Volume:22 Page:5959
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Ma Ying, Xu Tianpei, Kim Kangchul

Abstract

The Convolutional Neural Network (CNN) has demonstrated excellent performance in image recognition and has brought new opportunities for sign language recognition. However, features undergo many nonlinear transformations during the convolution operations, and traditional CNN models are insufficient for handling the correlation between images. In American Sign Language (ASL) recognition, the letters J and Z, which are signed with moving gestures, pose recognition challenges. This paper proposes a novel Two-Stream Mixed (TSM) method with feature extraction and fusion operations to improve the correlation of feature expression between two time-consecutive images of dynamic gestures. The proposed TSM-CNN system is composed of preprocessing, the TSM block, and CNN classifiers. Two consecutive images of a dynamic gesture are used as the stream inputs, and resizing, transformation, and augmentation are carried out in the preprocessing stage. The fused feature map obtained by addition and concatenation in the TSM block is used as the input to the classifiers, which then classify the images. The TSM-CNN model with the highest performance scores among three concatenation methods is selected as the definitive model for ASL recognition. We design four CNN models with TSM: TSM-LeNet, TSM-AlexNet, TSM-ResNet18, and TSM-ResNet50. The experimental results show that the CNN models with TSM outperform the models without TSM. TSM-ResNet50 achieves the best accuracy of 97.57% on the MNIST and ASL datasets and can be applied to an RGB image sensing system for hearing-impaired people.
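
The fusion step described in the abstract (addition followed by concatenation of two stream feature maps) can be illustrated with a minimal sketch. The abstract does not specify the three concatenation variants, so the ordering used here (first stream, summed map, second stream) is an assumption, as are all function names and the toy nested-list feature maps; this is not the authors' implementation.

```python
# Hypothetical sketch: feature maps are nested lists shaped [channels][height][width].
# All names and the concatenation order are illustrative assumptions.

def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [
        [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]

def concat_channels(*maps):
    """Concatenate feature maps along the channel axis."""
    fused = []
    for m in maps:
        fused.extend(m)
    return fused

def fuse(stream_a, stream_b):
    """Assumed fusion: the summed map concatenated between both stream maps."""
    summed = add_maps(stream_a, stream_b)
    return concat_channels(stream_a, summed, stream_b)

# Two toy 1-channel, 2x2 feature maps standing in for two consecutive frames.
frame_t = [[[1, 2], [3, 4]]]
frame_t1 = [[[10, 20], [30, 40]]]

fused = fuse(frame_t, frame_t1)
print(len(fused))  # number of channels after fusion
print(fused[1])    # the summed channel
```

In a real model the fused map would be passed to a CNN classifier; here the channel count simply grows from 1 per stream to 3 after fusion.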

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/22/16/5959/pdf

References: 47 articles.

1. World Federation of the Deaf (WFD) https://wfdeaf.org

2. National Institute on Deafness and Other Communication Disorders (NIDCD)

3. Sign Language Recognition: A Deep Survey

4. Large-scale isolated gesture recognition using convolutional neural networks;Wang;Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR),2016

5. MultiD-CNN: A multi-dimensional feature learning approach based on deep convolutional networks for gesture recognition in RGB-D image sequences

Cited by 12 articles.

1. Multi-Stream Isolated Sign Language Recognition Based on Finger Features Derived from Pose Data;Electronics;2024-04-22

2. Sign Language Gestures Recognition using CNN and Inception v3;2024 International Conference on Emerging Smart Computing and Informatics (ESCI);2024-03-05

3. HGR-FYOLO: a robust hand gesture recognition system for the normal and physically impaired person using frozen YOLOv5;Multimedia Tools and Applications;2024-02-13

4. A Comprehensive Study on Relative Distances of Hand Landmarks Approach for American Sign Language Gesture;Augmented Human Research;2024-02-09

5. Multimodal fusion hierarchical self-attention network for dynamic hand gesture recognition;Journal of Visual Communication and Image Representation;2024-02