Authors:
Ahmed Benyamina, Soumia Benkrama, Bentaib Mohammed Yazid
Abstract
This article presents the development and evaluation of a speaker identification system based on deep learning, focusing on Convolutional Neural Networks (CNNs) trained on the AudioMNIST dataset. The system distinguishes speakers with high accuracy and reliability and improves substantially on state-of-the-art models, which points to applications in forensic science, security, and privacy protection. The paper examines audio signal representation, preprocessing techniques, and feature extraction methods, and shows how each contributes to the system's effectiveness. The CNN-based system remains robust under varying conditions, including noise and varying speech patterns. The findings underscore the system's capability to strengthen security measures and support forensic research, and the work offers scalable and efficient solutions for real-world scenarios. Future research directions include refining the dataset, exploring advanced optimization techniques, and addressing ethical considerations to ensure the system's robustness and practical utility across diverse applications.
Publisher
South Florida Publishing LLC