A class of bandit problems yielding myopic optimal strategies

Published:1992-09 Issue:3 Volume:29 Page:625-632
ISSN:0021-9002
Container-title:Journal of Applied Probability
language:en
Short-container-title:Journal of Applied Probability

Author:

Banks, Jeffrey S.; Sundaram, Rangarajan K.

Abstract

We consider the class of bandit problems in which each of the n ≧ 2 independent arms generates rewards according to one of the same two reward distributions, and discounting is geometric over an infinite horizon. We show that the dynamic allocation index of Gittins and Jones (1974) in this context is strictly increasing in the probability that an arm is the better of the two distributions. It follows as an immediate consequence that myopic strategies are the uniquely optimal strategies in this class of bandit problems, regardless of the value of the discount parameter or the shape of the reward distributions. Some implications of this result for bandits with Bernoulli reward distributions are given.
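As a rough illustration of the setting described in the abstract, the following sketch tracks, for each arm, the posterior probability that it draws from the better of the two distributions, and then picks the arm a myopic strategy would play. It is not taken from the paper. It assumes Bernoulli rewards, that "better" means the higher success probability, and the names `p_hi`, `p_lo`, `posterior`, and `myopic_arm` are hypothetical. Under those assumptions the expected immediate reward of an arm is increasing in its posterior, so the myopic choice is simply the arm with the largest belief.

```python
def posterior(pi, reward, p_hi, p_lo):
    """Bayes update of pi, the probability that an arm is the better type.

    Assumption (not from the source): rewards are Bernoulli, with success
    probability p_hi for the better distribution and p_lo for the other.
    """
    like_hi = p_hi if reward == 1 else 1.0 - p_hi
    like_lo = p_lo if reward == 1 else 1.0 - p_lo
    num = pi * like_hi
    return num / (num + (1.0 - pi) * like_lo)


def expected_reward(pi, p_hi, p_lo):
    """Expected immediate reward of an arm with belief pi."""
    return pi * p_hi + (1.0 - pi) * p_lo


def myopic_arm(beliefs, p_hi, p_lo):
    """Index of the arm with the highest expected immediate reward."""
    return max(range(len(beliefs)),
               key=lambda i: expected_reward(beliefs[i], p_hi, p_lo))


# Three arms with different prior beliefs; play the myopic arm once and
# update its belief after a hypothetical observed success.
beliefs = [0.3, 0.7, 0.5]
arm = myopic_arm(beliefs, p_hi=0.8, p_lo=0.4)
beliefs[arm] = posterior(beliefs[arm], reward=1, p_hi=0.8, p_lo=0.4)
```

Only the played arm's belief changes, since the arms are independent. The discount parameter does not appear because, per the abstract's result, the myopic choice does not depend on it in this class of problems.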

Publisher

Cambridge University Press (CUP)

Subject

Statistics, Probability and Uncertainty; General Mathematics; Statistics and Probability

References: 9 articles.

1. Bandit problems

2. Contributions to the "Two-Armed Bandit" Problem

3. Some results on two-armed bandits when both projects vary

Cited by 13 articles.

1. A confirmation of a conjecture on Feldman’s two-armed bandit problem;Journal of Applied Probability;2023-05-26

2. A central limit theorem, loss aversion and multi-armed bandits;Journal of Economic Theory;2023-04

3. Endogenous growth model with Bayesian learning and technology selection;Mathematical Social Sciences;2021-11

4. Dynamic survival bias in optimal stopping problems;Journal of Economic Theory;2021-09

5. Dynamic project selection;Theoretical Economics;2018-01