Predictive topology refinements in distributed stream processing system

Published: 2020-11-05 Issue: 11 Volume: 15 Page: e0240424
ISSN: 1932-6203
Container-title: PLOS ONE
Language: en
Short-container-title: PLoS ONE

Author:

Hanif Muhammad, Lee Choonhwa, Helal Sumi

Abstract

Cloud computing has evolved big data technologies into a consolidated paradigm with stream processing-as-a-service (SPaaS). As more enterprises offer cloud-based solutions to end users and to smaller enterprises, data volumes have boomed, drawing the interest of both industry and academia to big data analytics, streaming applications, and social networking applications. As companies shift to cloud-based, as-a-service solutions, competition in the market grows. Good quality of service (QoS) is a must for enterprises striving to survive in such a competitive environment. However, meeting the QoS goals of a service-level agreement (SLA) cost-effectively is challenging because the workload varies over time. This problem can be addressed if the system can predict the workload for the near future. In this paper, we present a novel topology-refining scheme based on a workload prediction mechanism. Predictions come from a model that combines support vector regression (SVR), autoregressive, and moving-average models with a feedback mechanism. Our streaming system is designed to increase overall performance by making topology refinement robust to the incoming workload on the fly, while still achieving the QoS goals set by the SLA constraints. The Apache Flink distributed processing engine is used as the testbed in the paper. The results show that the prediction scheme works well for both synthetic workloads and real data traces.
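The abstract's idea of predicting the near-future workload and refining the topology to match can be illustrated with a minimal sketch. This is not the paper's model: it stands in a plain moving average with an error-feedback term for the SVR, autoregressive, and moving-average combination, and it does not call any Apache Flink API. The names `FeedbackForecaster`, `required_parallelism`, `window`, `gain`, and `per_task_capacity` are hypothetical and chosen only for illustration.

```python
import math
from collections import deque


class FeedbackForecaster:
    """One-step-ahead workload forecast: moving average plus error feedback.

    Illustrative assumption only; the paper's SVR/AR/MA model is not reproduced.
    """

    def __init__(self, window=3, gain=0.5):
        self.recent = deque(maxlen=window)  # last `window` observed rates
        self.gain = gain                    # weight given to the newest residual
        self.bias = 0.0                     # smoothed residual of the moving average

    def observe(self, actual):
        """Record a measured rate and update the feedback correction."""
        if self.recent:
            base = sum(self.recent) / len(self.recent)
            residual = actual - base
            # Exponentially smooth the residual so the correction tracks trends.
            self.bias = (1.0 - self.gain) * self.bias + self.gain * residual
        self.recent.append(actual)

    def predict(self):
        """Forecast the next interval's rate (0.0 before any observation)."""
        if not self.recent:
            return 0.0
        return sum(self.recent) / len(self.recent) + self.bias


def required_parallelism(predicted_rate, per_task_capacity, max_parallelism):
    """Number of parallel operator instances needed for the predicted rate."""
    needed = math.ceil(predicted_rate / per_task_capacity)
    return max(1, min(max_parallelism, needed))


if __name__ == "__main__":
    forecaster = FeedbackForecaster(window=3, gain=0.5)
    for rate in [100.0, 120.0, 140.0, 160.0]:  # events per second, made-up values
        forecaster.observe(rate)
    forecast = forecaster.predict()
    print(forecast, required_parallelism(forecast, per_task_capacity=30.0, max_parallelism=8))
```

In this sketch the feedback term lets the forecast follow a rising workload more closely than the moving average alone would, and the parallelism is clamped so that a forecast of zero still leaves one instance running.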

Funder

National Research Foundation Korea

Institute of Information & communications Technology Planning & Evaluation

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary

References (56 articles)

1. Lam W, Liu L, Prasad S, Rajaraman A, Vacheri Z, Doan A. Muppet: MapReduce-style Processing of Fast Data. Proc VLDB Endow. 2012. https://doi.org/10.14778/2367502.2367520

2. Akidau T, et al. MillWheel: Fault-Tolerant Stream Processing at Internet Scale. Proc VLDB Endow. 2013.

3. Toshniwal A, Taneja S, Shukla A, Ramasamy K, Patel JM, Kulkarni S, et al. Storm @ Twitter. 2014; 147–156.

4. Zaharia M, et al. Discretized Streams: Fault-Tolerant Streaming Computation at Scale. SOSP. 2013.

5. Borthakur D, Rash S, Schmidt R, Aiyer A, Gray J, Sarma J Sen, et al. Apache hadoop goes realtime at Facebook. Proceedings of the 2011 international conference on Management of data—SIGMOD’11. 2011. p. 1071. https://doi.org/10.1145/1989323.1989438

Cited by 1 article.

1. A streaming data prediction method based on long short-term memory model and grey model. 2021 International Conference on Neural Networks, Information and Communication Engineering. 2021-10-15.