Speaker recognition based on characteristic spectrograms and an improved self-organizing feature map neural network

Published:2020-06-29
ISSN:2199-4536
Container-title:Complex & Intelligent Systems
language:en
Short-container-title:Complex Intell. Syst.

Author:

Jia Yanjie, Chen Xi, Yu Jieqiong, Wang Lianming, Xu Yuanzhe, Liu Shaojin, Wang Yonghui

Abstract

To capture a speaker’s pronunciation characteristics, a bionics-inspired method is proposed that uses spectrogram statistics to build a characteristic spectrogram: a stable representation of the speaker’s pronunciation obtained by linear superposition of short-time spectrograms. To address slow network training and recognition in speaker recognition systems on resource-constrained devices, an adaptive clustering self-organizing feature map (AC-SOM) algorithm is proposed, based on the traditional SOM neural network. The algorithm automatically adjusts the number of neurons in the competition layer according to the number of speakers to be recognized, until the number of clusters matches the number of speakers. A 100-speaker database of characteristic spectrogram samples was built and applied to the proposed AC-SOM model, yielding a maximum training time of only 304 s and a maximum sample recognition time of less than 28 ms. Compared with other approaches, the proposed method offers greatly improved training and recognition speed without sacrificing too much recognition accuracy. These results suggest that the proposed method satisfies the real-time data processing and execution requirements of edge intelligence systems better than other speaker recognition methods.
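The idea of forming a characteristic spectrogram by linear superposition of short-time spectrograms can be illustrated with a minimal sketch. The abstract does not give the exact statistic, frame length, or window, so everything here is an assumption made for illustration: the function name `characteristic_spectrogram`, the Hann window, the 256-sample frame, the 128-sample hop, and the use of a plain mean of magnitude spectra. It is not the authors' implementation.

```python
import numpy as np

def characteristic_spectrogram(signal, frame_len=256, hop=128):
    """Average the magnitude spectra of overlapping short-time frames.

    Hypothetical helper: the plain mean stands in for the paper's
    "linear superposition"; the actual statistic may differ.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)               # assumed window choice
    n_frames = 1 + (signal.size - frame_len) // hop
    acc = np.zeros(frame_len // 2 + 1)           # one bin per rfft output
    for i in range(n_frames):
        start = i * hop
        frame = signal[start:start + frame_len] * window
        acc += np.abs(np.fft.rfft(frame))        # superpose short-time spectra
    return acc / n_frames                        # normalize by frame count

# Usage on a synthetic 1 kHz tone sampled at 8 kHz (not speech data).
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
spec = characteristic_spectrogram(tone)
peak_hz = int(np.argmax(spec)) * sr / 256        # bin index -> frequency
```

Averaging across frames suppresses frame-to-frame variation, which is one plausible reading of why the abstract describes the result as a stable representation. A real system would feed such vectors, computed from speech, to the clustering network.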

Funder

National Natural Science Foundation of China

Jilin Scientific and Technological Development Program

Publisher

Springer Science and Business Media LLC

Subject

General Earth and Planetary Sciences,General Environmental Science

Link

https://link.springer.com/content/pdf/10.1007/s40747-020-00172-1.pdf

Reference: 26 articles.

1. Kinnunen T, Li H (2010) An overview of text-independent speaker recognition: from features to supervectors. Speech Commun 52(1):12–40

2. Singh N, Khan RA, Shree R (2012) Applications of speaker recognition. Proced Eng 38(1):3122–3126