Affiliation:
1. Department of Computer Science and Technology, Kean University, Union, NJ 07083, USA
Abstract
In an era where artificial intelligence (AI) bridges crucial communication gaps, this study extends AI’s utility to American and Taiwan Sign Language (ASL and TSL) communities through advanced models like the hierarchical vision transformer with shifted windows (Swin). This research evaluates Swin’s adaptability across sign languages, aiming for a universal platform for the unvoiced. Utilizing deep learning and transformer technologies, it has developed prototypes for ASL-to-English translation, supported by an educational framework to facilitate learning and comprehension, with the intention to include more languages in the future. This study highlights the efficacy of the Swin model, along with other models such as the vision transformer with deformable attention (DAT), ResNet-50, and VGG-16, in ASL recognition. The Swin model’s accuracy across various datasets underscore its potential. Additionally, this research explores the challenges of balancing accuracy with the need for real-time, portable language recognition capabilities and introduces the use of cutting-edge transformer models like Swin, DAT, and video Swin transformers for diverse datasets in sign language recognition. This study explores the integration of multimodality and large language models (LLMs) to promote global inclusivity. Future efforts will focus on enhancing these models and expanding their linguistic reach, with an emphasis on real-time translation applications and educational frameworks. These achievements not only advance the technology of sign language recognition but also provide more effective communication tools for the deaf and hard-of-hearing community.
Reference59 articles.
1. (2024, February 24). Home Page of the NAD. Available online: https://www.nad.org/resources/american-sign-language/learning-american-sign-language/.
2. (2024, February 24). Home Page of the NAD Youth. Available online: https://youth.nad.org/.
3. (2024, February 24). GitHub Repository of Swin Transformer. Available online: https://github.com/microsoft/Swin-Transformer.
4. Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11–17). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada.
5. (2024, February 24). GitHub Repository of DAT Transformer. Available online: https://github.com/LeapLabTHU/DAT.