Open Bandit Processes with Uncountable States and Time-Backward Effects-Reference-Cited by-同舟云学术

Open Bandit Processes with Uncountable States and Time-Backward Effects

Published:2013-06 Issue:02 Volume:50 Page:388-402
ISSN:0021-9002
Container-title:Journal of Applied Probability
language:en
Short-container-title:J. Appl. Probab.

Author:

Wu Xianyi,Zhou Xian

Abstract

Bandit processes and the Gittins index have provided powerful and elegant theory and tools for the optimization of allocating limited resources to competitive demands. In this paper we extend the Gittins theory to more general branching bandit processes, also referred to as open bandit processes, that allow uncountable states and backward times. We establish the optimality of the Gittins index policy with uncountably many states, which is useful in such problems as dynamic scheduling with continuous random processing times. We also allow negative time durations for discounting a reward to account for the present value of the reward that was received before the present time, which we refer to as time-backward effects. This could model the situation of offering bonus rewards for completing jobs above expectation. Moreover, we discover that a common belief on the optimality of the Gittins index in the generalized bandit problem is not always true without additional conditions, and provide a counterexample. We further apply our theory of open bandit processes with time-backward effects to prove the optimality of the Gittins index in the generalized bandit problem under a sufficient condition.

Publisher

Cambridge University Press (CUP)

Subject

Statistics, Probability and Uncertainty,General Mathematics,Statistics and Probability

Reference15 articles.

1. Open bandit processes and optimal scheduling of queueing networks

2. New results for generalized bandit problems