Thompson Sampling with Unrestricted Delays

Published: 2022-07-12
Container-title: Proceedings of the 23rd ACM Conference on Economics and Computation

Author:

Han Wu¹, Stefan Wager¹

Affiliation:

1. Stanford University, Stanford, CA, USA

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3490486.3538376

References (34 articles)

1. Shipra Agrawal and Navin Goyal. 2012. Analysis of Thompson Sampling for the Multi-armed Bandit Problem. In Proceedings of the 25th Annual Conference on Learning Theory. 39.1--39.26.

2. Shipra Agrawal and Navin Goyal. 2013a. Further Optimal Regret Bounds for Thompson Sampling. In Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics. 99--107.

3. Shipra Agrawal and Navin Goyal. 2013b. Thompson Sampling for Contextual Bandits with Linear Payoffs. In Proceedings of the 30th International Conference on Machine Learning. 127--135.

4. Near-Optimal Regret Bounds for Thompson Sampling

Cited by 1 article.

1. Learning Classifiers under Delayed Feedback with a Time Window Assumption. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022-08-14.