An Improved Speech Segmentation and Clustering Algorithm Based on SOM and K-Means

Published: 2020-09-12 Volume: 2020 Pages: 1-19
ISSN:1024-123X
Container-title:Mathematical Problems in Engineering
language:en

Author:

Jiang Nan¹, Liu Ting²

Affiliation:

1. Criminal Investigation Police University of China, Shenyang 110854, China

2. Liaoning University, Shenyang 110036, China

Abstract

This paper studies the segmentation and clustering of speech by speaker. To improve the accuracy of speech endpoint detection, the short-time average zero-crossing rate used in the traditional double-threshold method is replaced by the spectral centroid feature, and the thresholds are selected from the local maxima of the histogram of the statistical feature sequence, yielding a new speech endpoint detection algorithm. Compared with the traditional double-threshold algorithm, it improves detection accuracy and noise robustness at low SNR. Conventional k-means clustering requires the number of clusters to be given in advance and is strongly affected by the choice of initial cluster centers, while the self-organizing neural network algorithm converges slowly and cannot provide accurate clustering information. To address both problems, an improved k-means speaker clustering algorithm based on a self-organizing neural network is proposed: the number of clusters is predicted from the winning pattern of the competitive neurons in the trained network, and the weights of the neurons are used as the initial cluster centers of the k-means algorithm. Experimental results on multi-speaker mixed speech segmentation show that the proposed algorithm improves the accuracy of speech clustering and compensates for the shortcomings of both the k-means algorithm and the self-organizing neural network algorithm.
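The clustering idea in the abstract (train a self-organizing map, read the number of clusters off the winning neurons, then seed k-means with their weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D map layout, the learning-rate and neighborhood schedules, the `min_share` rule for deciding which neurons count as winners, and all function names are assumptions made for illustration only.

```python
import numpy as np

def train_som(data, n_units=6, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 1-D SOM (assumed layout); returns weights of shape (n_units, dim)."""
    rng = np.random.default_rng(seed)
    # Initialise unit weights from randomly chosen samples.
    weights = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    positions = np.arange(n_units)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood width
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            influence = np.exp(-((positions - winner) ** 2) / (2 * sigma ** 2))
            weights += lr * influence[:, None] * (x - weights)
            step += 1
    return weights

def winning_units(data, weights, min_share=0.05):
    """Units that win at least min_share of the samples (hypothetical rule)."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    wins = np.bincount(dists.argmin(axis=1), minlength=len(weights))
    return np.flatnonzero(wins >= min_share * len(data))

def kmeans(data, centers, n_iter=50):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            data[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def som_kmeans(data, **som_kwargs):
    """Predict k from the SOM winners, then run k-means seeded with their weights."""
    weights = train_som(data, **som_kwargs)
    active = winning_units(data, weights)
    return kmeans(data, weights[active])
```

Calling `som_kmeans(features)` on an array of per-segment feature vectors returns one cluster label per segment together with the final centers. The number of clusters found depends on the map size and on the assumed `min_share` threshold, so both would need tuning on real speaker features.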

Funder

Natural Science Foundation of Liaoning Province

Publisher

Hindawi Limited

Subject

General Engineering, General Mathematics

Link

http://downloads.hindawi.com/journals/mpe/2020/3608286.pdf

References: 24 articles.

Cited by 20 articles.

1. Chinese cities show different trend toward carbon peak;Science of The Total Environment;2024-07

2. A Study on Speech Recognition by a Neural Network Based on English Speech Feature Parameters;Journal of Advanced Computational Intelligence and Intelligent Informatics;2024-05-20

3. Adaptive Vibration Monitoring of Railway Track Structures Using the UWFBG by the Identification of Train-Load Patterns;Buildings;2024-04-26

4. Improving Automatic Forced Alignment for Phoneme Segmentation in Quranic Recitation;IEEE Access;2024

5. Feature Embedding Representation for Unsupervised Speaker Diarization in Telephone Calls;Communications in Computer and Information Science;2023-11-05