ASYMPTOTIC BAYES ANALYSIS FOR THE FINITE-HORIZON
ONE-ARMED-BANDIT PROBLEM
-
Published: 2003-01
Issue: 1
Volume: 17
Pages: 53-82
-
ISSN: 0269-9648
-
Container-title: Probability in the Engineering and Informational Sciences
-
Language: en
-
Short-container-title: Prob. Eng. Inf. Sci.
Author:
Burnetas, Apostolos N.; Katehakis, Michael N.
Abstract
The multiarmed-bandit problem is often taken as a basic model for the trade-off between the exploration and utilization required for efficient optimization under uncertainty. In this article, we study the situation in which the unknown performance of a new bandit is to be evaluated and compared with that of a known one over a finite horizon. We assume that the bandits represent random variables with distributions from the one-parameter exponential family. When the objective is to maximize the Bayes expected sum of outcomes over a finite horizon, it is shown that optimal policies tend to simple limits when the length of the horizon is large.
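The finite-horizon Bayes problem described in the abstract can be illustrated with a minimal backward-induction sketch. The code below is an assumption-laden toy, not the paper's construction: it takes the unknown arm to be Bernoulli(p) with a Beta(a, b) prior (a member of the one-parameter exponential family), the known arm to pay a fixed amount lam per pull, and treats switching to the known arm as absorbing, which is optimal for the one-armed bandit. All names (`one_armed_bandit_value`, `V`) are hypothetical.

```python
from functools import lru_cache

def one_armed_bandit_value(horizon, lam, a=1.0, b=1.0):
    """Bayes value of a toy finite-horizon one-armed bandit.

    Unknown arm: Bernoulli(p), p ~ Beta(a, b) prior (illustrative choice).
    Known arm: pays lam per pull with certainty; retiring to it is absorbing.
    Returns (optimal Bayes expected total reward, whether pulling the
    unknown arm first is strictly better than retiring immediately).
    """
    @lru_cache(maxsize=None)
    def V(s, f, n):
        # s, f: observed successes/failures on the unknown arm; n: pulls left.
        if n == 0:
            return 0.0
        retire = n * lam  # stay on the known arm for the rest of the horizon
        # Posterior mean of the unknown arm given s successes and f failures.
        p = (a + s) / (a + b + s + f)
        explore = p * (1.0 + V(s + 1, f, n - 1)) + (1.0 - p) * V(s, f + 1, n - 1)
        return max(retire, explore)

    v = V(0, 0, horizon)
    return v, v > horizon * lam

# Example: known arm matches the prior mean (0.5); for a long enough horizon,
# the option value of learning makes the unknown arm the first pull.
value, pull_unknown = one_armed_bandit_value(20, 0.5)
```

Even in this toy version the qualitative behavior matches the setting of the article: the first pull goes to the unknown arm exactly when its exploration value exceeds the break-even payoff of the known arm, and exploration becomes more attractive as the horizon grows, which is the regime whose limiting policies the paper characterizes.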
Publisher
Cambridge University Press (CUP)
Subject
Industrial and Manufacturing Engineering; Management Science and Operations Research; Statistics, Probability and Uncertainty; Statistics and Probability
Cited by: 57 articles.