Research on Clustering Analysis of Big Data

Published: 2012-09 Volume: 6-7 Pages: 82-87
ISSN: 2234-991X
Container-title: Advanced Engineering Forum
Short-container-title: AEF

Author:

Yuan Yuan Ming¹, Wu Chan Le¹

Affiliation:

1. Wuhan University

Abstract

Big Data volumes are too large to be processed with traditional clustering analysis techniques, which suffer from long running times and problems of computability. After analyzing the k-means clustering algorithm, the authors propose a new algorithm. They identify the part of k-means that can be parallelized and redesign the algorithm's flow on the MapReduce framework, which addresses the problems above. Experiments show that the new algorithm is feasible and effective.
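The abstract does not describe the redesigned flow in detail. As a rough illustration only, and not the authors' algorithm, one common way to express k-means in MapReduce terms is to treat the point-to-centroid assignment as the map step and the centroid recomputation as the reduce step. The single-process Python sketch below assumes this split; the names `nearest`, `map_phase`, `reduce_phase`, and `kmeans` are hypothetical, and a real deployment would run the two phases on a framework such as Hadoop rather than in one process.

```python
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: List[Point]) -> int:
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points: Iterable[Point], centroids: List[Point]) -> Iterator[Tuple[int, Point]]:
    """Map step (assumed): each point is handled independently, so this part can run in parallel.

    Emits (centroid index, point) pairs.
    """
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs: Iterable[Tuple[int, Point]], centroids: List[Point]) -> List[Point]:
    """Reduce step (assumed): average the points emitted for each centroid index."""
    sums = {}    # centroid index -> per-dimension running sum
    counts = {}  # centroid index -> number of points seen
    for idx, point in pairs:
        if idx not in sums:
            sums[idx] = [0.0] * len(point)
            counts[idx] = 0
        sums[idx] = [s + p for s, p in zip(sums[idx], point)]
        counts[idx] += 1
    updated = list(centroids)  # centroids with no assigned points stay where they are
    for idx, total in sums.items():
        updated[idx] = tuple(t / counts[idx] for t in total)
    return updated


def kmeans(points: List[Point], centroids: List[Point], max_iter: int = 20) -> List[Point]:
    """Repeat map and reduce until the centroids stop changing or max_iter is reached."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

In this sketch only the map step is embarrassingly parallel; the loop over iterations remains sequential because each pass needs the centroids produced by the previous one.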

Publisher

Trans Tech Publications, Ltd.

Link

https://www.scientific.net/AEF.6-7.82.pdf

References (4 articles)

1. Ralf Lammel, Data Programmability Team. Google's MapReduce Programming Model-Revisited. Redmond, WA, USA: Microsoft Corp. (2007).

2. Jeffrey Dean, Sanjay Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, Vol. 51, No. 1 (2008), pp. 107-113.

3. Hadoop Community. Hadoop Distributed File System, http://hadoop.apache.org/hdfs (2010).

4. J. A. Hartigan and M. A. Wong. Algorithm AS 136: A K-Means Clustering Algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 28, No. 1 (1979), pp. 100-108.

Cited by 2 articles.

1. Big data analysis and research on consumption demand of sports fitness leisure activities;Cluster Computing;2018-03-07

2. Volume-Based Data Representation of Big Data Analysis;Advanced Materials Research;2013-09