Abstract
Deep neural networks (DNNs) can face limitations when trained for recognition tasks. This study aims to improve recognition by optimizing deep learning features for hand gesture image recognition. We propose a novel approach that enhances features from well-trained DNNs using an improved radial basis function (RBF) neural network, targeting recognition within individual gesture categories. To do so, we cluster images with a self-organizing map (SOM) network to identify optimal centers for RBF training. In a comparative analysis over various distance functions and an expanded number of cluster centers, our enhanced SOM, which employs the Hassanat distance metric, outperforms the traditional K-Means method and accurately identifies hand gestures in images. Our training pipeline learns from both hand gesture videos and static images, addressing the growing need for machines to interact through gestures. Gesture videos pose challenges such as sensitivity to hand pose sequences within a single gesture category and overlapping hand poses caused by high similarity and repetition; despite these challenges, our pipeline achieved significant enhancement without requiring time-related training data. We also improve the recognition of static hand pose images within the same category. Our work advances DNNs by combining deep learning features with SOM-based center selection for RBF training.
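To make the role of the distance metric concrete, the following minimal Python sketch computes the Hassanat distance as it is commonly defined and uses it to find the closest cluster center for a feature vector, which is the matching step a SOM performs. The function names `hassanat_distance` and `best_matching_unit` and the toy vectors are assumptions made for illustration only. This is not the authors' implementation, and it omits the SOM weight update and the RBF training.

```python
from typing import Sequence


def hassanat_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of per-dimension Hassanat terms; each term lies in [0, 1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        if lo >= 0:
            total += 1.0 - (1.0 + lo) / (1.0 + hi)
        else:
            # Shift both values by |lo| so the ratio stays well defined.
            shift = abs(lo)
            total += 1.0 - (1.0 + lo + shift) / (1.0 + hi + shift)
    return total


def best_matching_unit(feature: Sequence[float],
                       centers: Sequence[Sequence[float]]) -> int:
    """Index of the center closest to `feature` under the Hassanat distance."""
    return min(range(len(centers)),
               key=lambda i: hassanat_distance(feature, centers[i]))


if __name__ == "__main__":
    # Hypothetical 2-D "deep features" and candidate centers.
    centers = [[10.0, 10.0], [1.0, 2.5], [-3.0, 0.0]]
    print(best_matching_unit([1.0, 2.0], centers))
```

Because each per-dimension term is bounded, a single feature with a very large value cannot dominate the total, which is one commonly cited reason for preferring this metric to the Euclidean distance.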