Authors:
Felipe Caro, Aparupa Das Gupta
Publisher:
Springer Science and Business Media LLC
Subjects:
Management Science and Operations Research; General Decision Sciences
References: 30 articles.
1. Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (2003). The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1), 48–77.
2. Bagnell, J. D., Ng, A. Y., & Schneider, J. (2001). Solving uncertain Markov decision problems. Technical report CMU-RI-TR-01-25, Pittsburgh, PA: Robotics Institute, Carnegie Mellon University.
3. Bertsekas, D. (2000). Dynamic programming and optimal control (Vol. II). Belmont, MA: Athena Scientific.
4. Besbes, O., Gur, Y., & Zeevi, A. (2014). Optimal exploration-exploitation in multi-armed-bandit problems with non-stationary rewards. Working paper, Columbia Business School.
5. Burnetas, A. N., & Katehakis, M. N. (1996). Optimal adaptive policies for sequential allocation problems. Advances in Applied Mathematics, 17(2), 122–142.
Cited by:
8 articles.