Authors:
Al-Fatlawi Ahmed Abdul Hassan, Mohammed Ghassan N., Al Barazanchi Israa
Abstract
Hash functions are an integral part of MapReduce software, in both Apache Hadoop and Spark. If the hash function performs badly, the load in the reduce phase becomes unbalanced and access times spike. To investigate this problem, we ran the WordCount program with several different hash functions on Amazon AWS, using the Amazon Elastic MapReduce (EMR) framework. The paper investigates general-purpose, cryptographic, checksum, and special-purpose hash functions, and presents the corresponding runtime results of this analysis.
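To illustrate the role the hash function plays in reduce-side load balancing, the following is a minimal sketch (not taken from the paper) of a custom Hadoop partitioner. Hadoop's built-in HashPartitioner assigns each key to a reducer via key.hashCode(); the class name CustomHashPartitioner and the choice of FNV-1a as the plugged-in hash are illustrative assumptions standing in for one of the hash functions under test.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Sketch of a custom partitioner for a WordCount job: the hash function chosen
// here decides which reduce task receives each word. A skewed hash concentrates
// keys on a few reducers, and the reduce-phase runtime spikes accordingly.
public class CustomHashPartitioner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numReduceTasks) {
        // Mask off the sign bit, then map the hash onto the available reducers,
        // mirroring what Hadoop's default HashPartitioner does with hashCode().
        int hash = fnv1a(key.toString());
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    // FNV-1a: one example of a general-purpose (non-cryptographic) hash function.
    private static int fnv1a(String s) {
        int h = 0x811C9DC5;            // FNV offset basis (32-bit)
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x01000193;           // FNV prime (32-bit)
        }
        return h;
    }
}
```

A job would select this partitioner with job.setPartitionerClass(CustomHashPartitioner.class); swapping the body of fnv1a is enough to compare different hash functions under otherwise identical WordCount runs.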
Publisher
Southwest Jiaotong University
References (36 articles)
1. HADOOP, A. (2018) Hadoop. [Online] Available from: http://hadoop.apache.org [Accessed 17/09/19].
2. SPARK, A. (2016) Apache Spark: Lightning-fast cluster computing. [Online] Available from: https://sur.ly/o/spark.apache.org/AA000014 [Accessed 17/09/19].
3. BIANCHINI, M., GORI, M., and SCARSELLI, F. (2005) Inside PageRank. ACM Transactions on Internet Technology, 5 (1), pp. 92-128.
4. HE, B., FANG, W., LUO, Q., GOVINDARAJU, N.K., and WANG, T. (2008) Mars: a MapReduce framework on graphics processors. In: Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, Toronto, October 2008. New York: Association for Computing Machinery, pp. 260-269.
5. KATSOULIS, S. (2011) Implementation of Parallel Hash Join Algorithms over Hadoop. Edinburgh: University of Edinburgh.
Cited by
1 article.