Continuous Sign Language Recognition through a Context-Aware Generative Adversarial Network

Published:2021-04-01 Issue:7 Volume:21 Page:2437
ISSN:1424-8220
Container-title:Sensors
language:en
Short-container-title:Sensors

Author:

Papastratis Ilias, Dimitropoulos Kosmas, Daras Petros

Abstract

Continuous sign language recognition is a weakly supervised task dealing with the identification of continuous sign gestures from video sequences, without any prior knowledge about the temporal boundaries between consecutive signs. Most existing methods focus mainly on the extraction of spatio-temporal visual features without exploiting text or contextual information to further improve the recognition accuracy. Moreover, the ability of deep generative models to effectively model data distributions has not yet been investigated in the field of sign language recognition. To this end, a novel approach for context-aware continuous sign language recognition using a generative adversarial network architecture, named the Sign Language Recognition Generative Adversarial Network (SLRGAN), is introduced. The proposed network architecture consists of a generator that recognizes sign language glosses by extracting spatial and temporal features from video sequences, as well as a discriminator that evaluates the quality of the generator’s predictions by modeling text information at the sentence and gloss levels. The paper also investigates the importance of contextual information in sign language conversations for both Deaf-to-Deaf and Deaf-to-hearing communication. Contextual information, in the form of hidden states extracted from the previous sentence, is fed into the bidirectional long short-term memory module of the generator to improve the recognition accuracy of the network. At the final stage, sign language translation is performed by a transformer network, which converts sign language glosses to natural language text. The proposed method achieved word error rates of 23.4%, 2.1% and 2.26% on the RWTH-Phoenix-Weather-2014, Chinese Sign Language (CSL) and Greek Sign Language (GSL) Signer Independent (SI) datasets, respectively.
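The abstract reports its results as word error rates. As a minimal sketch of how that metric is commonly computed (word-level edit distance divided by the reference length), and not the authors' evaluation code, the following assumes whitespace-separated gloss sequences. The function name `word_error_rate` and the example glosses are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: glosses are separated by whitespace. This is a generic
    WER sketch, not the evaluation script used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one gloss")

    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical gloss sequences: one deletion against a three-gloss reference.
print(word_error_rate("MORGEN REGEN NORD", "MORGEN NORD"))
```

A WER of 2.1% therefore means roughly two substitution, deletion or insertion errors per hundred reference glosses. Because insertions are counted, the value can exceed 100% when the hypothesis is much longer than the reference.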

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/21/7/2437/pdf

Reference: 64 articles.

1. 3D Technologies and Applications in Sign Language

2. MS-ASL: A Large-Scale Data Set and Benchmark for Understanding American Sign Language;Joze;arXiv,2018

Cited by 39 articles.

1. Quantifying inconsistencies in the Hamburg Sign Language Notation System;Expert Systems with Applications;2024-12

2. Cross-modal knowledge distillation for continuous sign language recognition;Neural Networks;2024-11

3. TB-Net: Intra- and inter-video correlation learning for continuous sign language recognition;Information Fusion;2024-09

4. Word separation in continuous sign language using isolated signs and post-processing;Expert Systems with Applications;2024-09

5. Reviewing 25 years of continuous sign language recognition research: Advances, challenges, and prospects;Information Processing & Management;2024-09