Affiliation:
1. Rochester Institute of Technology, Rochester, New York
Abstract
Many sign languages are bona fide natural languages with grammatical rules and lexicons, and hence can benefit from machine translation methods. Similarly, since sign language is a visual-spatial language, it can also benefit from computer vision methods for encoding it. With the advent of deep learning methods in recent years, significant advances have been made in natural language processing (specifically neural machine translation) and in computer vision (specifically image and video captioning). Researchers have therefore begun extending these learning methods to sign language understanding. Sign language interpretation is especially challenging because it involves a continuous visual-spatial modality where meaning is often derived from context.
The focus of this article, therefore, is to examine various deep learning–based methods for encoding sign language as inputs, and to analyze the efficacy of several machine translation methods over three different sign language datasets. The goal is to determine which combinations are sufficiently robust for sign language translation without any gloss-based information.
To understand the role of the different input features, we perform ablation studies over the model architectures (input features + neural translation models) for improved continuous sign language translation. These input features include body and finger joints and facial points, as well as vector representations/embeddings from convolutional neural networks. The machine translation models explored range from several baseline sequence-to-sequence approaches to more complex networks using attention, reinforcement learning, and the transformer architecture. We implement the translation methods over multiple sign languages: German (GSL), American (ASL), and Chinese (CSL). From our analysis, the transformer model combined with input embeddings from ResNet50 or with pose-based landmark features outperformed all the other sequence-to-sequence models, achieving higher BLEU-2 through BLEU-4 scores on the controlled and constrained GSL benchmark dataset. These combinations also showed significant promise on the less controlled ASL and CSL datasets.
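As a rough illustration of the best-performing combination described above, the sketch below pairs per-frame ResNet50 embeddings with a standard transformer encoder-decoder that decodes spoken-language tokens directly, with no gloss supervision. It is a minimal sketch assuming PyTorch and torchvision; the model sizes, vocabulary, class name SignTranslationTransformer, and toy inputs are illustrative assumptions rather than the authors' exact configuration.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50

    class SignTranslationTransformer(nn.Module):
        """Frame-level ResNet50 features -> transformer encoder-decoder -> text tokens."""

        def __init__(self, vocab_size=1000, d_model=512, nhead=8, num_layers=3):
            super().__init__()
            backbone = resnet50(weights=None)           # pretrained weights would be used in practice
            backbone.fc = nn.Identity()                 # keep the 2048-d pooled feature per frame
            self.backbone = backbone
            self.frame_proj = nn.Linear(2048, d_model)  # project frame features to model width
            self.token_emb = nn.Embedding(vocab_size, d_model)
            self.transformer = nn.Transformer(
                d_model=d_model, nhead=nhead,
                num_encoder_layers=num_layers, num_decoder_layers=num_layers,
                batch_first=True,
            )
            self.out = nn.Linear(d_model, vocab_size)   # logits over the spoken-language vocabulary

        def forward(self, frames, target_tokens):
            # frames: (batch, time, 3, 224, 224); target_tokens: (batch, tgt_len)
            b, t = frames.shape[:2]
            feats = self.backbone(frames.flatten(0, 1)).view(b, t, -1)   # (b, t, 2048)
            src = self.frame_proj(feats)                                  # encoder input
            tgt = self.token_emb(target_tokens)                           # decoder input
            causal = self.transformer.generate_square_subsequent_mask(target_tokens.size(1))
            hidden = self.transformer(src, tgt, tgt_mask=causal)
            return self.out(hidden)                                       # (b, tgt_len, vocab_size)

    # Toy forward pass: two clips of 16 frames each, 12 target tokens per sentence.
    model = SignTranslationTransformer()
    frames = torch.randn(2, 16, 3, 224, 224)
    tokens = torch.randint(0, 1000, (2, 12))
    print(model(frames, tokens).shape)  # torch.Size([2, 12, 1000])

In the experiments summarized above, the transformer component is compared against the other sequence-to-sequence variants, and the ResNet50 embeddings against pose-based landmark features, with BLEU-2 through BLEU-4 computed on the decoded sentences.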
Funder
National Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Science Applications, Human-Computer Interaction
References
113 articles.
Cited by
19 articles.