Authors:
Sumon Md. Shaheenur Islam, Ali Muttakee Bin, Bari Samiul, Ohi Ipshita Rahman, Islam Mayisha, Rahman Syed Mahfuzur
Abstract
Sign language is the most effective means of communication for deaf and hard-of-hearing people. Understanding it requires specialized training, so hearing people around them who lack that training cannot communicate with them effectively. The main objective of this study is to streamline a deep learning model for sign language recognition using 30 of the most prevalent words in everyday life. The dataset consists of custom-processed video sequences for 30 ASL (American Sign Language) words, recorded by 5 subjects with 50 sample videos per class. A CNN is applied to the video frames to extract spatial features, and an LSTM then uses those per-frame features to predict the sign being performed in the video. We present and evaluate results on two separate datasets, a pose dataset and a raw video dataset, both trained with the Long-term Recurrent Convolutional Network (LRCN) approach. Finally, a test accuracy of 92.66% was reached on the raw dataset and 93.66% on the pose dataset.
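The abstract describes an LRCN in which a CNN extracts spatial features from each video frame and an LSTM classifies the resulting feature sequence into one of the 30 word classes. The sketch below shows one way such a model could be assembled with TensorFlow/Keras; the sequence length, image size, layer widths, and training settings (SEQ_LEN, IMG_SIZE, the Conv2D/LSTM sizes) are illustrative assumptions, not the authors' reported configuration.

import tensorflow as tf
from tensorflow.keras import layers, models

SEQ_LEN, IMG_SIZE, NUM_CLASSES = 20, 64, 30  # assumed values, not from the paper

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, IMG_SIZE, IMG_SIZE, 3)),
    # CNN applied independently to every frame to extract spatial features
    layers.TimeDistributed(layers.Conv2D(16, (3, 3), padding="same", activation="relu")),
    layers.TimeDistributed(layers.MaxPooling2D((4, 4))),
    layers.TimeDistributed(layers.Conv2D(32, (3, 3), padding="same", activation="relu")),
    layers.TimeDistributed(layers.MaxPooling2D((4, 4))),
    layers.TimeDistributed(layers.Flatten()),
    # LSTM aggregates the per-frame features over time
    layers.LSTM(64),
    # Softmax over the 30 ASL word classes
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

The same skeleton would serve both reported variants: the raw video dataset would feed frame images directly, while the pose dataset would replace the image input with per-frame keypoint vectors and drop the convolutional front end.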