Subject
Artificial Intelligence, Cognitive Neuroscience, Computer Science Applications
Reference
37 articles.
1. Sample mean based index policies with O(log n) regret for the multi-armed bandit problem;Agrawal;Adv. Appl. Probab.,1995
2. Analysis of Thompson sampling for the multi-armed bandit problem;Agrawal;Proc. Conf. Learning Theory,2012
3. Bandit-based local feature subset selection;Ashtiani;Neurocomput.,2014
4. Regret bounds and minimax policies under partial monitoring;Audibert;J. Mach. Learn. Res.,2010
5. Exploration-exploitation trade-off using variance estimates in multi-armed bandits;Audibert;Theor. Comput. Sci.,2009
Cited by
8 articles.