Abstract
Stoppable families of alternative bandit processes are decision processes with the property that at each decision epoch the choice is between allocating service to one of the constituent bandit processes or stopping and deciding in favour of one of them. The problem considered is that of finding optimal (or good suboptimal) strategies for such processes. The theory for non-stoppable families leads us to study the performance of a simple strategy. This is shown to be optimal under certain conditions. These conditions are discussed and an example relating to research planning is given.
Publisher
Cambridge University Press (CUP)
Subject
Statistics, Probability and Uncertainty; General Mathematics; Statistics and Probability
Cited by
5 articles.
1. Optimal Learning Before Choice;SSRN Electronic Journal;2016
2. On a reduction principle in dynamic programming;Advances in Applied Probability;1988-12
3. Optimal Search in Negotiation Analysis;Journal of Conflict Resolution;1985-09
4. On a sufficient condition for superprocesses due to Whittle;Journal of Applied Probability;1982-03
5. Discussion of Dr Gittins' Paper;Journal of the Royal Statistical Society: Series B (Methodological);1979-01