An Attention-Enhanced Multi-Scale and Dual Sign Language Recognition Network Based on a Graph Convolution Network

Published: 2021-02-05 Issue: 4 Volume: 21 Page: 1120
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Meng Lu, Li Ronghui

Abstract

Sign language is the most important means of communication for hearing-impaired people, and research on sign language recognition can help hearing people understand it. We reviewed the classic methods of sign language recognition and found that their accuracy is limited by factors such as redundant information, finger occlusion, motion blur, and the diverse signing styles of different people. To overcome these shortcomings, we propose a multi-scale and dual sign language recognition network (SLR-Net) based on a graph convolutional network (GCN). The original input data are RGB videos; we first extract skeleton data from them and then use the skeleton data for sign language recognition. SLR-Net is mainly composed of three sub-modules: a multi-scale attention network (MSA), a multi-scale spatiotemporal attention network (MSSTA), and an attention-enhanced temporal convolution network (ATCN). MSA allows the GCN to learn the dependencies between long-distance vertices; MSSTA can directly learn spatiotemporal features; and ATCN allows the GCN to better learn long temporal dependencies. Three attention mechanisms (multi-scale, spatiotemporal, and temporal) are proposed to further improve robustness and accuracy. In addition, a keyframe extraction algorithm is proposed that greatly improves efficiency at the cost of a small loss in accuracy. Experimental results show that our method reaches 98.08% accuracy on the CSL-500 dataset, which has a 500-word vocabulary. Even on the challenging DEVISIGN-L dataset, which has a 2000-word vocabulary, it reaches 64.57% accuracy, outperforming other state-of-the-art sign language recognition methods.
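The abstract's core idea, a graph convolution over skeleton joints that can also reach distant joints, can be illustrated with a minimal sketch. This is not the authors' SLR-Net: the 4-joint chain, the coordinates, the identity weight matrix, and all function names are hypothetical, and using powers of the normalized adjacency for a "multi-scale" view is only one common approach, assumed here for illustration.

```python
import math

def normalized_adjacency(num_joints, bones):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = [[0.0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        a[i][i] = 1.0  # self-loop so each joint keeps its own features
    for i, j in bones:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]

def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def graph_conv(a_hat, features, weight):
    """One graph convolution step: ReLU(A_hat @ X @ W)."""
    out = matmul(matmul(a_hat, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical 4-joint chain: shoulder - elbow - wrist - fingertip.
bones = [(0, 1), (1, 2), (2, 3)]
a_hat = normalized_adjacency(4, bones)

# Made-up 2-D joint coordinates for a single frame.
frame = [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weight, for illustration only

out = graph_conv(a_hat, frame, weight)  # 1-hop aggregation

# Assumed "multi-scale" view: the squared adjacency links joints 2 hops apart.
a_hat_2hop = matmul(a_hat, a_hat)
out_2hop = graph_conv(a_hat_2hop, frame, weight)
```

In this toy graph the shoulder and wrist are not directly connected, so the 1-hop matrix has a zero entry between them, while the 2-hop matrix has a positive one. That is the sense in which a larger scale lets a joint aggregate information from more distant vertices.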

Funder

National Key Research and Development Program of China

National Natural Science Foundation of China

Fundamental Research Funds for the Central Universities

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/21/4/1120/pdf

References: 55 articles.

1. A Framework for Hand Gesture Recognition Based on Accelerometer and EMG Sensors;Xu;IEEE Trans. Syst. Man Cybern. Part A Syst. Hum.,2011

2. Machine learning based sign language recognition: a review and its research frontier

3. Sign Language Recognition Systems: A Decade Systematic Literature Review

4. A review of hand gesture and sign language recognition techniques

5. Convolutional and recurrent neural network for human activity recognition: Application on American sign language

Cited by 25 articles.

1. Structure-aware sign language recognition with spatial–temporal scene graph;Information Processing & Management;2024-11

2. Hand-Aware Graph Convolution Network for Skeleton-Based Sign Language Recognition;Journal of Information and Intelligence;2024-08

3. Isolated Arabic Sign Language Recognition Using a Transformer-based Model and Landmark Keypoints;ACM Transactions on Asian and Low-Resource Language Information Processing;2024-01-15

4. Isolated Japanese Sign Language Recognition Based on Image Entropy Variation Rate and Score-Level Multi-Cue Fusion;2024 2nd International Conference on Computer Graphics and Image Processing (CGIP);2024-01-12

5. Sign Language Recognition: A Comprehensive Review of Traditional and Deep Learning Approaches, Datasets, and Challenges;IEEE Access;2024