Abstract
The aim of this paper is to evaluate the performance of the optimal policy (the Gittins index policy) for open tax problems of the type considered by Klimov in the undiscounted limit. In this limit, the state-dependent part of the cost is linear in the state occupation numbers for the multi-armed bandit, but is quadratic for the tax problem. The discussion of the passage to the limit for the tax problem is believed to be largely new; the principal novelty is our evaluation of the matrix of the quadratic form. These results are confirmed by a dynamic programming analysis, which also suggests how the optimal policy should be modified when resources can be freely deployed only within workstations, rather than system-wide.
Publisher
Cambridge University Press (CUP)
Subject
Statistics, Probability and Uncertainty; General Mathematics; Statistics and Probability
Cited by
6 articles.
1. On the Gittins index for multistage jobs; Queueing Systems; 2022-04-07
2. References; Multi-Armed Bandit Allocation Indices; 2011-02-16
3. Klimov's Model; Wiley Encyclopedia of Operations Research and Management Science; 2011-01-14
4. Appendices; Foundations and Applications of Sensor Management; 2008
5. Comments on: Dynamic priority allocation via restless bandit marginal productivity indices; TOP; 2007-10-03