On the asymptotic optimality of greedy index heuristics for multi-action restless bandits

Published: 2015-09 Issue: 3 Volume: 47 Page: 652-667
ISSN: 0001-8678
Container-title: Advances in Applied Probability
language: en
Short-container-title: Advances in Applied Probability

Author:

Hodge D. J., Glazebrook K. D.

Abstract

The class of restless bandits proposed by Whittle (1988) has long been known to be intractable. This paper presents an optimality result which extends that of Weber and Weiss (1990) for restless bandits to a more general setting in which individual bandits have multiple levels of activation but are subject to an overall resource constraint. The contribution is motivated by the recent works of Glazebrook et al. (2011a), (2011b), who discussed the performance of index heuristics for resource allocation in such systems. Hitherto, index heuristics have been shown, under a condition of full indexability, to be optimal for a natural Lagrangian relaxation of such problems in which a resource is purchased rather than constrained. We find that, under key assumptions about the nature of solutions to a deterministic differential equation, the index heuristics above are asymptotically optimal in a sense described by Whittle. We then demonstrate that these assumptions always hold for three-state bandits.

Publisher

Cambridge University Press (CUP)

Subject

Applied Mathematics, Statistics and Probability

References: 21 articles.

1. Dynamic Assortment with Demand Learning for Seasonal Consumer Goods

2. Index policies for a class of discounted restless bandits

3. Comments on: Dynamic priority allocation via restless bandit marginal productivity indices

4. Outsourcing warranty repairs: Dynamic allocation

Cited by 11 articles.

1. A restless bandit model for dynamic ride matching with reneging travelers;European Journal of Operational Research;2024-07

2. Exponential asymptotic optimality of Whittle index policy;Queueing Systems;2023-05-21

3. Learning and Communications Co-Design for Remote Inference Systems: Feature Length Selection and Transmission Scheduling;IEEE Journal on Selected Areas in Information Theory;2023

4. Index-aware reinforcement learning for adaptive video streaming at the wireless edge;Proceedings of the Twenty-Third International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing;2022-10-03

5. Optimal Myopic Policy for Restless Bandit: A Perspective of Eigendecomposition;IEEE Journal of Selected Topics in Signal Processing;2022-04