Keddah

Published:2019-07-27 Issue:3 Volume:29 Page:1-25
ISSN:1049-3301
Container-title:ACM Transactions on Modeling and Computer Simulation
language:en
Short-container-title:ACM Trans. Model. Comput. Simul.

Author:

Deng Jie¹, Tyson Gareth¹, Cuadrado Felix¹, Uhlig Steve¹

Affiliation:

1. School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK

Abstract

As a distributed system, Hadoop heavily relies on the network to complete data-processing jobs. While the traffic generated by Hadoop jobs is critical for job execution performance, the actual behaviour of Hadoop network traffic is still poorly understood. This lack of understanding greatly complicates research relying on Hadoop workloads. In this article, we explore Hadoop traffic through empirical traces. We analyse the generated traffic of multiple types of MapReduce jobs, with varying input sizes, and cluster configuration parameters. We present Keddah, a toolchain for capturing, modelling, and reproducing Hadoop traffic, for use with network simulators to better capture the behaviour of Hadoop. By imitating the Hadoop traffic generation process and considering the YARN resource allocation, Keddah can be used to create Hadoop traffic workloads, enabling reproducible Hadoop research in more realistic scenarios.

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications, Modeling and Simulation

Link

https://dl.acm.org/doi/pdf/10.1145/3301503

References: 41 articles.

1. Apache Software Foundation. 2017. The Apache Mahout project. Retrieved from http://mahout.apache.org/.

2. A scalable, commodity data center network architecture

3. Data center TCP (DCTCP)

4. Quantitative comparisons of the state-of-the-art data center architectures

5. The Case for Evaluating MapReduce Performance Using Workload Suites

Cited by 1 article.

1. Applying improved K-means algorithm into official service vehicle networking environment and research;Soft Computing;2020-04-03