Self-Learning Threshold-Based Load Balancing

Published: 2021-09-16
ISSN: 1091-9856
Journal: INFORMS Journal on Computing
Language: en

Author:

Goldsztajn Diego¹, Borst Sem C.¹, van Leeuwaarden Johan S. H.², Mukherjee Debankur³, Whiting Philip A.⁴

Affiliation:

1. Eindhoven University of Technology, Eindhoven 5612 AZ, Netherlands;

2. Tilburg University, Tilburg 5037 AB, Netherlands;

3. Georgia Institute of Technology, Atlanta, Georgia 30332;

4. Macquarie University, Macquarie Park, New South Wales 2109, Australia

Abstract

We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher aims at maximizing the overall quality of service by balancing the load through a simple threshold policy. We demonstrate that such a policy is optimal on the fluid and diffusion scales, while only involving a small communication overhead, which is crucial for large-scale deployments. In order to set the threshold optimally, it is important, however, to learn the load of the system, which may be unknown. For that purpose, we design a control rule for tuning the threshold in an online manner. We derive conditions that guarantee that this adaptive threshold settles at the optimal value, along with estimates for the time until this happens. In addition, we provide numerical experiments that support the theoretical results and further indicate that our policy copes effectively with time-varying demand patterns.

Summary of Contribution: Data centers and cloud computing platforms are the digital factories of the world, and managing resources and workloads in these systems involves operations research challenges of an unprecedented scale. Due to the massive size, complex dynamics, and wide range of time scales, the design and implementation of optimal resource-allocation strategies is prohibitively demanding from a computation and communication perspective. These resource-allocation strategies are essential for certain interactive applications, for which the available computing resources need to be distributed optimally among users in order to provide the best overall experienced performance. This is the subject of the present article, which considers the problem of distributing tasks among the various server pools of a large-scale service system, with the objective of optimizing the overall quality of service provided to users.

A solution to this load-balancing problem cannot rely on maintaining complete state information at the gateway of the system, since this is computationally unfeasible, due to the magnitude and complexity of modern data centers and cloud computing platforms. Therefore, we examine a computationally light load-balancing algorithm that is yet asymptotically optimal in a regime where the size of the system approaches infinity. The analysis is based on a Markovian stochastic model, which is studied through fluid and diffusion limits in the aforementioned large-scale regime. The article analyzes the load-balancing algorithm theoretically and provides numerical experiments that support and extend the theoretical results.

Publisher

Institute for Operations Research and the Management Sciences (INFORMS)

Subject

General Engineering

References: 30 articles.

1. Inverse problems in queueing theory and Internet probing

2. Quality of service and flow level admission control in the Internet

3. Insensitive load balancing

Cited by 1 article.

1. Learning and Balancing Unknown Loads in Large-Scale Systems; Mathematics of Operations Research; 2024-05-03