Affiliation:
1. Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria, Milano, Italy
Abstract
Big Data applications make it possible to analyze large amounts of data that are not necessarily structured, but they also present new challenges. For example, predicting the performance of frameworks such as Hadoop and Spark can be costly, hence the need for models that can support designers and developers. Big Data systems are becoming a central force in society, and models can also enable the development of intelligent systems that provide Quality of Service (QoS) guarantees to their users through runtime system reconfiguration. This paper studies a novel modeling approach based on fluid Petri nets to predict the execution time of MapReduce and Spark applications; the approach is suitable for runtime performance prediction. The models have been validated by an extensive experimental campaign performed at CINECA, the Italian supercomputing center, and on the Microsoft Azure HDInsight data platform. Results show that, on average, the predictions are within about 9.5% of the actual measurements for MapReduce and about 10% for Spark.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture, Software
Cited by
13 articles.