Multi-state feature optimization of sign glosses for continuous sign language recognition

Published:2023-10-04 Issue:4 Volume:45 Page:6645-6654
ISSN:1064-1246
Container-title:Journal of Intelligent & Fuzzy Systems
Short-container-title:IFS

Author:

Lin Tao¹, Chen Biao¹, Wang Ruixia¹, Zhang Yabo¹, Shi Yu¹, Jiang Nan¹

Affiliation:

1. School of Computer Science & Information Engineering, Shanghai Institute of Technology, Shanghai, China

Abstract

Vision-based Continuous Sign Language Recognition (CSLR) is a challenging, weakly supervised task that aims to segment and recognize sign language from weakly annotated image stream sequences. Compared with Isolated Sign Language Recognition (ISLR), its biggest challenge is that the image stream sequences have ambiguous temporal boundaries. Recent CSLR work has shown that vision-based sign language recognition centers on image stream feature extraction and feature alignment, and that overfitting is the most critical problem in CSLR training. After investigating advanced CSLR models from recent years, we identify adequate training of the feature extractor as the key issue. This paper therefore proposes a CSLR model with Multi-state Feature Optimization (MFO), based on a Fully Convolutional Network (FCN) and Connectionist Temporal Classification (CTC). The MFO mechanism supervises the multiple states of each sign gloss during modeling and provides more refined labels for training the CTC decoder, which effectively addresses the overfitting that arises in training while also significantly reducing training time. We validate the MFO method on a popular CSLR dataset and demonstrate that the model achieves better performance.
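
The abstract does not spell out how glosses are split into states, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each gloss id is expanded into a fixed number of consecutive state labels, that index 0 is the CTC blank, and that a gloss is emitted when its first state appears after decoding. The function names and the `num_states` parameter are hypothetical. Only `ctc_collapse` reflects the standard CTC decoding rule (merge consecutive repeats, then drop blanks).

```python
from itertools import groupby

BLANK = 0  # assumed convention: index 0 is the CTC blank


def expand_to_states(gloss_ids, num_states):
    """Expand each gloss id (>= 1) into num_states consecutive state labels.

    Hypothetical labelling scheme: gloss g owns the state ids
    (g - 1) * num_states + 1 ... g * num_states.
    """
    states = []
    for g in gloss_ids:
        base = (g - 1) * num_states + 1
        states.extend(range(base, base + num_states))
    return states


def ctc_collapse(frame_labels):
    """Standard CTC collapse: merge consecutive repeats, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != BLANK]


def states_to_glosses(state_ids, num_states):
    """Map state labels back to gloss ids.

    Simplifying assumption: a gloss is emitted whenever its first
    state appears; its remaining states are ignored.
    """
    glosses = []
    for s in state_ids:
        if (s - 1) % num_states == 0:
            glosses.append((s - 1) // num_states + 1)
    return glosses


if __name__ == "__main__":
    targets = expand_to_states([2, 5], num_states=3)
    frames = [0, 4, 4, 0, 5, 5, 6, 0, 13, 14, 14, 15, 0]  # toy frame-level labels
    decoded = ctc_collapse(frames)
    print(targets, decoded, states_to_glosses(decoded, num_states=3))
```

Under these assumptions, the expanded state sequence would serve as the finer-grained CTC target, and recognition output is mapped back to glosses after the usual collapse step.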

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

References (35 articles)

1. A robust method to authenticate car license plates using segmentation and ROI based approach;Aggarwal;Smart and Sustainable Built Environment,2020

2. Supervised Sequence Labelling

3. Graves A., Fernández S., Gomez F. and Schmidhuber J., Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, in: Proceedings of the 23rd International Conference on Machine Learning, 2006, pp. 369–376.

4. Self-Mutual Distillation Learning for Continuous Sign Language Recognition;Hao;Proceedings of the IEEE/CVF International Conference on Computer Vision,2021

5. SARWAS: Deep ensemble learning techniques for sentiment based recommendation system;Choudhary;Expert Systems with Applications,2023