Incentivizing Exploration in Linear Contextual Bandits under Information Gap

Published: 2023-09-14
Container-title: Proceedings of the 17th ACM Conference on Recommender Systems

Author:

Wang Huazheng¹, Xu Haifeng², Li Chuanhao³, Liu Zhiyuan⁴, Wang Hongning³

Affiliation:

1. Oregon State University, USA

2. University of Chicago, USA

3. University of Virginia, USA

4. University of Colorado, Boulder, USA

Funder

Army Research Office

NSF (National Science Foundation)

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3604915.3608794

References: 34 articles.

1. Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári. 2011. Improved Algorithms for Linear Stochastic Bandits. In NIPS. 2312–2320.

2. Marc Abeille and Alessandro Lazaric. 2017. Linear Thompson sampling revisited. In Artificial Intelligence and Statistics. PMLR, 176–184.

3. Priyank Agrawal and Theja Tulabandhula. 2020. Incentivising Exploration and Recommendations for Contextual Bandits with Payments. In Multi-Agent Systems and Agreement Technologies. Springer, 159–170.

4. Shipra Agrawal and Navin Goyal. 2013. Thompson sampling for contextual bandits with linear payoffs. In International Conference on Machine Learning. PMLR, 127–135.

5. Peter Auer. 2002. Using Confidence Bounds for Exploitation-Exploration Trade-offs. Journal of Machine Learning Research.