Author:
Lai Tze Leung,Ying Zhiliang
Abstract
Asymptotic approximations are developed herein for the optimal policies in discounted multi-armed bandit problems in which new projects are continually appearing, commonly known as ‘open bandit problems’ or ‘arm-acquiring bandits’. It is shown that under certain stability assumptions the open bandit problem is asymptotically equivalent to a closed bandit problem in which there is no arrival of new projects, as the discount factor approaches 1. Applications of these results to optimal scheduling of queueing networks are given. In particular, Klimov&s priority indices for scheduling queueing networks are shown to be limits of the Gittins indices for the associated closed bandit problem, and extensions of Klimov&s results to preemptive policies and to unstable queueing systems are given.
Publisher
Cambridge University Press (CUP)
Subject
Applied Mathematics,Statistics and Probability
Cited by
42 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献