An Optimized Clustering Quality Analysis in K-Means Cluster Using Silhouette Scores

Published:2024-05-20 Page:49-63
ISSN:2327-0411
Container-title:Advances in Computational Intelligence and Robotics

Author:

Ilyas F. Mohamed¹, Priscila S. Silvia¹

Affiliation:

1. Bharath Institute of Higher Education and Research, India

Abstract

Data-driven problem-solving requires the capacity to use cutting-edge computational methods to explain fundamental phenomena to a large audience. Political and social studies need this capacity. Quantitative methods often involve knowledge of concepts, trends, and facts that affect the study programme. Researchers often do not know the structure of the data, or the assumptions behind it, when analysing it. Data exploration may also obscure instruction in social science research methodology. It is nevertheless an essential part of applied research before predictive modelling and hypothesis testing. Clustering is part of data mining, and picking the right cluster count is key to improving predictive model accuracy for large datasets. K-means is a popular unsupervised machine learning (ML) algorithm. The method usually finds discrete, non-overlapping clusters, assigning each data point to exactly one group. It can be difficult to choose the best k-means approach. On the Human Freedom Index (HFI) dataset, mini-batch k-means (MBK-means) using the Hamerly method reduces the number of iterations and increases clustering efficiency. The silhouette score function from Scikit-learn was used to obtain the average silhouette coefficient of all samples for various cluster counts. A clustering with fewer negative silhouette values is considered better, and the cluster count with the highest average silhouette score is taken as the optimum.
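
The selection rule in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library, a hypothetical six-point toy dataset instead of the HFI data, and hand-written candidate labelings instead of mini-batch k-means output. The helper `silhouette_samples` is written here for illustration and computes, for each point, s = (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. In practice, scikit-learn's `MiniBatchKMeans`, `silhouette_samples`, and `silhouette_score` would likely take the place of these pieces.

```python
from math import dist
from statistics import mean


def silhouette_samples(points, labels):
    """Return the silhouette coefficient of every point (illustrative helper)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Common convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical toy data: two tight groups that are far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hand-made candidate labelings (assumptions, not k-means output).
candidates = {
    "k=2, natural split": [0, 0, 0, 1, 1, 1],
    "k=2, one point misassigned": [0, 0, 1, 1, 1, 1],
    "k=3, one group split": [0, 0, 2, 1, 1, 1],
}

for name, labels in candidates.items():
    scores = silhouette_samples(points, labels)
    negatives = sum(1 for s in scores if s < 0)
    print(f"{name}: mean silhouette = {mean(scores):.3f}, negative values = {negatives}")

# Pick the labeling with the highest average silhouette coefficient.
best = max(candidates, key=lambda name: mean(silhouette_samples(points, candidates[name])))
print("selected:", best)
```

A negative coefficient means a point lies closer, on average, to another cluster than to its own, which is why counting negative values and comparing the average score point to the same choice here.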

Publisher

IGI Global
