Affiliation:
1. Department of Machine Learning and Information Retrieval, Faculty of Artificial Intelligence, Kafrelsheikh University, Kafr El Sheikh, Egypt
2. Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
3. Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
4. Information Technology Department, Faculty of Computers and Artificial Intelligence, Damietta University, Egypt
Abstract
Pose estimation is a computer vision task that detects and estimates the pose of a person or an object in images or videos. Some of its challenges can leverage advances in computer vision research, while others still require efficient solutions. In this paper, we provide a preliminary review of the state of the art in pose estimation, covering both traditional and deep learning approaches. We also implement Hand Pose Estimation (HandPE), which uses the PoseNet architecture for hand sign problems, on an ASL dataset and compare its performance across different optimizers using 10 common evaluation metrics on different datasets. In addition, we discuss related future research directions in pose estimation and explore new architectures for different pose estimation types. With the PoseNet model, the experiments achieved accuracies of 99.9%, 89%, 97%, 79%, and 99% on the ASL alphabet, HARPET, Yoga, Animal, and Head datasets, respectively, when compared across common optimizers and evaluation metrics.