UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem-Reference-Cited by-同舟云学术

UCB revisited: Improved regret bounds for the stochastic multi-armed bandit problem

Published:2010-09 Issue:1-2 Volume:61 Page:55-65
ISSN:0031-5303
Container-title:Periodica Mathematica Hungarica
language:en
Short-container-title:Period Math Hung

Author:

Auer Peter,Ortner Ronald

Publisher

Springer Science and Business Media LLC

Subject

General Mathematics

Link

http://link.springer.com/content/pdf/10.1007/s10998-010-3055-6.pdf

Reference11 articles.

1. Rajeev Agrawal, Sample mean based index policies with O(log n) regret for the multi-armed bandit problem, Adv. in Appl. Probab., 27 (1995), 1054–1078.

2. Jean-Yves Audibert and Sébastien Bubeck, Minimax policies for adversarial and stochastic bandits, Proceedings of the 22nd Annual Conference on Learning Theory (COLT2009), 2009, 217–226.

3. Jean-Yves Audibert, Rémi Munos and Csaba Szepesvári, Exploration-exploitation tradeoff using variance estimates in multi-armed bandits, Theor. Comput. Sci., 410 (2009), 1876–1902.

4. Peter Auer, Nicolò Cesa-Bianchi and Paul Fischer, Finite-Time Analysis of the Multi-Armed Bandit Problem, Mach. Learn., 47 (2002), 235–256.

5. Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund and Robert E. Schapire, The Nonstochastic Multiarmed Bandit Problem, SIAM J. Comput., 32 (2002), 48–77.

Cited by 125 articles. 订阅此论文施引文献订阅此论文施引文献，注册后可以免费订阅5篇论文的施引文献，订阅后可以查看论文全部施引文献

1. Adapting bandit algorithms for settings with sequentially available arms;Engineering Applications of Artificial Intelligence;2024-05

2. Distributed Data-Driven Learning-Based Optimal Dynamic Resource Allocation for Multi-RIS-Assisted Multi-User Ad-Hoc Network;Algorithms;2024-01-19

3. Batched Neural Bandits;ACM / IMS Journal of Data Science;2024-01-16

4. Nearly Minimax-Optimal Regret for Linearly Parameterized Bandits;IEEE Transactions on Information Theory;2024-01

5. Weak Signal Asymptotics for Sequentially Randomized Experiments;Management Science;2023-12-06