Affiliation:
1. North China Institute of Science and Technology
2. Inner Mongolia Electronic Information Vocational Technical College
Abstract
Cloud computing has grown steadily in popularity in recent years. Alongside SQL, MapReduce, a programming model for large-scale data processing, has become a widely studied topic. Many real-world tasks, such as data processing for search engines, can be parallelized through a simple interface consisting of two functions, Map and Reduce. We compare the performance of the Hadoop implementation of MapReduce with that of SQL Server through simulations. In these simulations, Hadoop completes the same query faster than SQL Server. We also test several factors to determine whether they affect Hadoop's performance. In particular, adding more machines for data processing improves Hadoop's performance, especially on large-scale data sets.
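As an illustration of the two-function interface described in the abstract, the following is a minimal single-process sketch of a word count written in the Map and Reduce style. It is not the code used in the study and does not use the Hadoop API; the names map_fn, reduce_fn, and run_map_reduce are hypothetical and chosen only for this example. A real framework such as Hadoop would distribute the map calls, the grouping step, and the reduce calls across machines.

```python
from collections import defaultdict


def map_fn(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def reduce_fn(word, counts):
    """Reduce: combine all values emitted for one key into a single result."""
    return word, sum(counts)


def run_map_reduce(documents):
    """Hypothetical local driver: map each record, group by key, then reduce."""
    grouped = defaultdict(list)
    for document in documents:
        for key, value in map_fn(document):
            grouped[key].append(value)  # stands in for the shuffle phase
    return dict(reduce_fn(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_map_reduce(["the quick fox", "the lazy dog", "The fox"]))
```

Because each map call depends only on its own record and each reduce call depends only on the values for its own key, both phases can in principle run in parallel, which is the property the abstract refers to.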
Publisher
Trans Tech Publications, Ltd.