Affiliation:
1. Department of Electronics and Communication Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Delhi - NCR Campus, Ghaziabad, UP, India
Abstract
Voice signals are an essential input source for applications based on human-computer interaction. Gender identification from voice signals is one of the most challenging tasks in this area. For voice-signal analysis, deep learning algorithms provide an alternative to conventional classification algorithms. To identify gender from the voice signals of female, male, and ‘first-time’ transgender speakers, a deep learning algorithm is used to improve the robustness of the identification model, with Mel-frequency cepstral coefficients (MFCC) as the voice-signal feature. This article presents the gender identification accuracy obtained on recorded live voice signals. The voice samples of the third gender are recorded in Hindi. Hindi voice samples from transgender speakers are a very low-resource category and are not available from any recognized source. The simulation results are text-independent and do not depend on the duration of the signals. A recurrent neural network with bidirectional long short-term memory (RNN-BiLSTM) has been simulated on the recorded voice signals, and the outcome is compared with results reported earlier in the literature. The proposed model achieves gender-wise average accuracies of 91.44%, 94.94%, and 96.11% for male, female, and transgender speakers, respectively. Identification accuracy is highest for transgender speakers. The overall average accuracy of the proposed model is 94.16%.
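As a quick consistency check on the reported figures, the short sketch below recomputes the overall average from the three gender-wise accuracies. It assumes the overall figure is the unweighted mean of the per-gender values, which the abstract does not state explicitly; the names `per_class_accuracy` and `macro_average` are illustrative and do not come from the article.

```python
# Gender-wise average accuracies (percent) as quoted in the abstract.
per_class_accuracy = {"male": 91.44, "female": 94.94, "transgender": 96.11}


def macro_average(scores):
    """Unweighted mean of per-class scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


average = macro_average(per_class_accuracy)
# Class with the highest reported accuracy.
best = max(per_class_accuracy, key=per_class_accuracy.get)

print(f"overall average: {average:.2f}%, highest: {best}")
```

Under this assumption the unweighted mean rounds to the 94.16% quoted in the abstract, and the highest per-gender value belongs to the transgender class.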
Publisher
Association for Computing Machinery (ACM)