1. Shipra Agrawal and Navin Goyal. 2012. Analysis of Thompson sampling for the multi-armed bandit problem. In Conference on Learning Theory. JMLR Workshop and Conference Proceedings, 39.1–39.26.
2. Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47 (2002), 235–256.
3. Soumya Basu, Karthik Abinav Sankararaman, and Abishek Sankararaman. 2021. Beyond log²(T) regret for decentralized bandits in matching markets. In International Conference on Machine Learning. PMLR, 705–715.
4. Simina Branzei and Yuval Peres. 2021. Multiplayer bandit learning, from competition to cooperation. In Proceedings of Thirty Fourth Conference on Learning Theory (Proceedings of Machine Learning Research, Vol. 134), Mikhail Belkin and Samory Kpotufe (Eds.). PMLR, 679–723. https://proceedings.mlr.press/v134/branzei21a.html
5. Sarah H Cen and Devavrat Shah. 2022. Regret, stability & fairness in matching markets with bandit learners. In International Conference on Artificial Intelligence and Statistics. PMLR, 8938–8968.