Abstract
Modern data centers widely deploy cluster computing applications such as MapReduce and Spark. Because the coflow/task abstraction accurately expresses the requirements of cluster computing applications, various task-based solutions have been proposed to improve application-level performance. However, most of these solutions require modifying the applications to obtain task information, which makes them impractical in many scenarios. In this paper, we propose BTP, a Bayesian decision-based Task Prediction mechanism that identifies tasks and predicts the task-size category. First, we design an automatic identification mechanism that identifies tasks without manual modification of the applications. Then we leverage Bayesian decision theory to predict the task-size category. Through a series of large-scale NS2 simulations, we demonstrate that BTP accurately identifies tasks and predicts the task-size category. More specifically, BTP achieves 96% precision and 92% recall, with accuracy of up to 98%.
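To illustrate the general idea of a Bayesian decision over task-size categories, the following is a minimal sketch and not the BTP mechanism itself. It assumes two hypothetical categories ("small" and "large"), a single hypothetical discrete observation (whether a task has few or many flows), and made-up prior and likelihood values. The abstract does not specify the features or probabilities that BTP uses. The rule simply picks the category with the highest posterior probability.

```python
def bayes_decide(priors, likelihoods, observation):
    """Pick the category with the highest posterior for one discrete observation.

    priors:       {category: P(category)}
    likelihoods:  {category: {observation: P(observation | category)}}
    Returns (best_category, posteriors).
    """
    # Unnormalized posterior: P(category) * P(observation | category)
    scores = {
        category: prior * likelihoods[category].get(observation, 0.0)
        for category, prior in priors.items()
    }
    total = sum(scores.values())
    if total == 0:
        # Observation never seen under any category: fall back to the priors.
        posteriors = dict(priors)
    else:
        posteriors = {c: s / total for c, s in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical values, chosen only for illustration (not taken from the paper).
PRIORS = {"small": 0.7, "large": 0.3}
LIKELIHOODS = {
    "small": {"few_flows": 0.8, "many_flows": 0.2},
    "large": {"few_flows": 0.1, "many_flows": 0.9},
}

category, posteriors = bayes_decide(PRIORS, LIKELIHOODS, "many_flows")
print(category, posteriors)
```

With these assumed numbers, observing many flows outweighs the higher prior for small tasks, so the decision shifts to the "large" category.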
Funder
National Natural Science Foundation of China
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications, Software