An efficient framework for obtaining the initial cluster centers

Author:

Mishra B. K., Mohanty Sachi Nandan, Baidyanath R. R., Ali Shahid, Abduvalieva D., Awwad Fuad A., Ismail Emad A. A., Gupta Manish

Abstract

Clustering is an important tool for data mining because it can uncover key patterns without any prior supervisory information. The initial selection of cluster centers plays a key role in the final quality of the clustering. Researchers often choose the centers at random to obtain them quickly and speed up their models. In doing so, however, they sacrifice the true essence of subgroup formation and on numerous occasions end up with poor clustering. For this reason we propose a qualitative approach to obtaining the initial cluster centers, with a focus on attaining well-separated clusters. Our initial contributions altered the classical K-means algorithm in an attempt to obtain near-optimal cluster centers. We earlier suggested a few new approaches, namely far efficient K-means (FEKM), modified center K-means (MCKM) and modified FEKM using Quickhull (MFQ), which produced accurate centers leading to excellent cluster formation. K-means, which selects the centers randomly, seems to converge slightly earlier than these methods, which is their only weakness. We continued this study to reduce the computational cost of our methods and arrived at farthest leap center selection (FLCS). All these methods were thoroughly analyzed in terms of clustering effectiveness, correctness, homogeneity, completeness, complexity and actual execution time to convergence. Effectiveness was assessed with performance indices such as Dunn's index, the Davies–Bouldin index and the silhouette coefficient; correctness with the Rand measure; and homogeneity and completeness with the V-measure. Experimental results on diverse real-world datasets taken from the UCI repository suggest that both FEKM and FLCS obtain well-separated centers, while the latter converges earlier.
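The abstract does not spell out how FEKM, MCKM, MFQ or FLCS pick their centers, so the following is only a minimal sketch of the general idea of choosing well-separated initial centers instead of random ones. It uses a generic greedy farthest-point traversal, which is an assumption made for illustration and not the authors' published procedure; the function name `farthest_first_centers` and the sample data are hypothetical.

```python
import math


def farthest_first_centers(points, k):
    """Greedy farthest-point initialization (illustrative, not FEKM/FLCS).

    Starts from the point farthest from the data mean, then repeatedly adds
    the point whose distance to its nearest chosen center is largest.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]

    # Deterministic seed: the point farthest from the overall mean.
    first = max(points, key=lambda p: math.dist(p, mean))
    centers = [first]

    # min_dist[i] is the distance from points[i] to its nearest chosen center.
    min_dist = [math.dist(p, first) for p in points]

    while len(centers) < k:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        new_center = points[idx]
        centers.append(new_center)
        min_dist = [
            min(d, math.dist(p, new_center)) for d, p in zip(min_dist, points)
        ]
    return centers


# Hypothetical toy data: three visually separated groups in 2-D.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
initial_centers = farthest_first_centers(sample, 3)
```

The resulting centers could then be handed to an ordinary K-means loop in place of random seeds. The sketch is deterministic, so repeated runs on the same data give the same starting centers, at the cost of an extra pass over the data for each center chosen.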

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary
