On an index policy for restless bandits

Published:1990-09 Issue:3 Volume:27 Page:637-648
ISSN:0021-9002
Container-title:Journal of Applied Probability
language:en

Author:

Richard R. Weber, Gideon Weiss

Abstract

We investigate the optimal allocation of effort to a collection of n projects. The projects are ‘restless’ in that the state of a project evolves in time, whether or not it is allocated effort. The evolution of the state of each project follows a Markov rule, but transitions and rewards depend on whether or not the project receives effort. The objective is to maximize the expected time-average reward under a constraint that exactly m of the n projects receive effort at any one time. We show that as m and n tend to ∞ with m/n fixed, the per-project reward of the optimal policy is asymptotically the same as that achieved by a policy which operates under the relaxed constraint that an average of m projects be active. The relaxed constraint was considered by Whittle (1988) who described how to use a Lagrangian multiplier approach to assign indices to the projects. He conjectured that the policy of allocating effort to the m projects of greatest index is asymptotically optimal as m and n tend to ∞. We show that the conjecture is true if the differential equation describing the fluid approximation to the index policy has a globally stable equilibrium point. This need not be the case, and we present an example for which the index policy is not asymptotically optimal. However, numerical work suggests that such counterexamples are extremely rare and that the size of the suboptimality which one might expect is minuscule.
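The index rule described in the abstract (activate the m projects of greatest index at each decision time) can be illustrated with a minimal simulation sketch. It assumes a discrete-time simplification with identical projects, and it takes the per-state indices as given rather than computing them by the Lagrangian approach. The function name `simulate_index_policy` and all numerical values below are hypothetical and are not taken from the paper.

```python
import numpy as np


def simulate_index_policy(P_passive, P_active, r_passive, r_active,
                          index, n, m, steps, seed=0):
    """Estimate the time-average reward per project when, at every step,
    exactly m of n identical projects with the largest index are active.

    P_passive, P_active: (k, k) row-stochastic transition matrices.
    r_passive, r_active: length-k reward vectors.
    index: length-k vector of per-state indices (assumed given).
    """
    rng = np.random.default_rng(seed)
    k = len(index)
    states = rng.integers(0, k, size=n)  # arbitrary initial states
    total = 0.0
    for _ in range(steps):
        # Rank projects by the index of their current state, breaking ties
        # at random by shuffling before a stable sort.
        perm = rng.permutation(n)
        order = perm[np.argsort(-index[states[perm]], kind="stable")]
        active = np.zeros(n, dtype=bool)
        active[order[:m]] = True

        # Collect rewards for active and passive projects.
        total += r_active[states[active]].sum()
        total += r_passive[states[~active]].sum()

        # Every project moves, whether or not it is active ("restless").
        new_states = np.empty(n, dtype=int)
        for i in range(n):
            P = P_active if active[i] else P_passive
            new_states[i] = rng.choice(k, p=P[states[i]])
        states = new_states
    return total / (steps * n)


# Hypothetical two-state example; these numbers are illustrative only.
P_passive = np.array([[0.9, 0.1], [0.4, 0.6]])
P_active = np.array([[0.5, 0.5], [0.1, 0.9]])
r_passive = np.array([0.0, 0.0])
r_active = np.array([0.2, 1.0])
index = np.array([0.3, 0.8])  # assumed indices, not derived here

avg_reward = simulate_index_policy(P_passive, P_active, r_passive,
                                   r_active, index, n=20, m=5, steps=200)
print(avg_reward)
```

Increasing n and m together with m/n held fixed is the regime in which the abstract's asymptotic statements apply; this sketch only simulates the policy and does not test optimality.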

Publisher

Cambridge University Press (CUP)

Subject

Statistics, Probability and Uncertainty; General Mathematics; Statistics and Probability

References: 4 articles.

1. Random Perturbations of Dynamical Systems

2. Mitra D. and Weiss A. (1988) A fluid limit of a closed queueing network with applications to data networks.

3. Restless bandits: activity allocation in a changing world

Cited by 274 articles.

1. Low-complexity algorithm for restless bandits with imperfect observations;Mathematical Methods of Operations Research;2024-09-05

2. A Restless Bandit Model for Energy-Efficient Job Assignments in Server Farms;IEEE Transactions on Automatic Control;2024-09

3. Tabular and Deep Learning for the Whittle Index;ACM Transactions on Modeling and Performance Evaluation of Computing Systems;2024-08-13

4. A restless bandit approach for capacitated condition based maintenance scheduling;Flexible Services and Manufacturing Journal;2024-06-13

5. An Easier-to-Verify Sufficient Condition for Whittle Indexability and Application to AoI Minimization;IEEE INFOCOM 2024 - IEEE Conference on Computer Communications;2024-05-20