Deep Learning Methods for Sign Language Translation

Author:

Ananthanarayana Tejaswini1, Srivastava Priyanshu1, Chintha Akash1, Santha Akhil1, Landy Brian1, Panaro Joseph1, Webster Andre1, Kotecha Nikunj1, Sah Shagan1, Sarchet Thomastine1, Ptucha Raymond1, Nwogu Ifeoma1

Affiliation:

1. Rochester Institute of Technology, Rochester, New York

Abstract

Many sign languages are bona fide natural languages, with their own grammatical rules and lexicons, and can therefore benefit from machine translation methods. Because sign language is also a visual-spatial language, it can likewise benefit from computer vision methods for encoding it. With the advent of deep learning in recent years, significant advances have been made in natural language processing (specifically neural machine translation) and in computer vision (specifically image and video captioning), and researchers have begun extending these methods to sign language understanding. Sign language interpretation is especially challenging because it involves a continuous visual-spatial modality in which meaning is often derived from context. The focus of this article, therefore, is to examine various deep learning–based methods for encoding sign language as input, and to analyze the efficacy of several machine translation methods across three different sign language datasets. The goal is to determine which combinations are sufficiently robust for sign language translation without any gloss-based information. To understand the role of the different input features, we perform ablation studies over the model architectures (input features + neural translation models) for improved continuous sign language translation. These input features include body and finger joints and facial points, as well as vector representations/embeddings from convolutional neural networks. The machine translation models explored include several baseline sequence-to-sequence approaches; more complex networks using attention and reinforcement learning; and the transformer model. We implement the translation methods over multiple sign languages: German (GSL), American (ASL), and Chinese (CSL).
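The attention-based and transformer models mentioned in the abstract share the same core operation. As a rough illustration only (a generic NumPy sketch of scaled dot-product attention, not the authors' implementation), attending over a sequence of per-frame input features, such as pose landmarks or ResNet50 embeddings, looks like this:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend over key/value pairs; Q, K, V each have shape (seq_len, d_k)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (len_q, len_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights

# Toy example: 4 video frames, each encoded as an 8-dim feature vector,
# used as queries, keys, and values (self-attention).
rng = np.random.default_rng(0)
frames = rng.standard_normal((4, 8))
context, attn = scaled_dot_product_attention(frames, frames, frames)
```

In a full transformer, this operation is repeated across multiple heads and layers, with learned projections for `Q`, `K`, and `V`; the sketch above shows only the attention step itself.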
From our analysis, the transformer model combined with input embeddings from ResNet50 or with pose-based landmark features outperformed all the other sequence-to-sequence models, achieving higher BLEU-2 through BLEU-4 scores on the controlled and constrained GSL benchmark dataset. These combinations also showed significant promise on the less controlled ASL and CSL datasets.
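The BLEU-2 through BLEU-4 scores reported above follow the standard BLEU definition: a geometric mean of modified n-gram precisions, scaled by a brevity penalty. A minimal single-reference sketch in plain Python (for illustration; evaluation toolkits add smoothing and multi-reference support):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Single-reference BLEU-max_n: geometric mean of modified n-gram
    precisions, scaled by a brevity penalty for short candidates."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each n-gram's count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    brevity = 1.0 if len(candidate) >= len(reference) else \
        math.exp(1 - len(reference) / len(candidate))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)

hyp = "the weather is nice today".split()
ref = "the weather is nice today".split()
score = bleu(hyp, ref, max_n=4)  # exact match -> 1.0
```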

Funder

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications, Human-Computer Interaction

References: 113 articles.


Cited by: 19 articles.

1. Reviewing 25 years of continuous sign language recognition research: Advances, challenges, and prospects;Information Processing & Management;2024-09

2. Non-Autoregressive Sign Language Translation: A Transformer Encoder-only Approach for Enhanced Multimodal Integration;Proceedings of the 2024 International Conference on Advanced Robotics, Automation Engineering and Machine Learning;2024-06-28

3. Interpreting and Translating the Korean Language Based on the Machine Translation Model for College Students;ACM Transactions on Asian and Low-Resource Language Information Processing;2024-06-27

4. Unveiling the Power of Machine Learning and Deep Learning in Advancing American Sign Language Recognition;2024 International Conference on Cognitive Robotics and Intelligent Systems (ICC - ROBINS);2024-04-17

5. Highly parallel and ultra-low-power probabilistic reasoning with programmable gaussian-like memory transistors;Nature Communications;2024-03-18
