1. Agrawal Shipra, Goyal Navin (2013). Thompson sampling for contextual bandits with linear payoffs. International Conference on Machine Learning: 127–135.
2. Ali S Nageeb, Lewis Greg, Vasserman Shoshana (2020). Voluntary disclosure and personalized pricing. Proceedings of the 21st ACM Conference on Economics and Computation: 537–538.
3. Araman Victor F, Caldentey René (2009). Dynamic pricing for nonperishable products with demand learning. Operations Research 57(5):1169–1188.
4. Auer P, Cesa-Bianchi N, Freund Y, Schapire RE (2002). The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32(1):48–77.
5. Badanidiyuru Ashwinkumar, Langford John, Slivkins Aleksandrs (2014). Resourceful contextual bandits. Conference on Learning Theory: 1109–1134.