1. Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford, Lihong Li, and Robert E. Schapire. 2014. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits. In Proceedings of the 31st International Conference on Machine Learning - Volume 32 (Beijing, China) (ICML'14). JMLR.org, II-1638–II-1646.
2. Shipra Agrawal and Navin Goyal. 2013. Thompson Sampling for Contextual Bandits with Linear Payoffs. In Proceedings of the 30th International Conference on Machine Learning - Volume 28 (Atlanta, GA, USA) (ICML'13). JMLR.org, III-1220–III-1228.
3. Jean-Yves Audibert and Sébastien Bubeck. 2010. Best arm identification in multi-armed bandits. In COLT - 23rd Conference on Learning Theory, 2010.
4. Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47 (2002), 235–256.
5. Identifying New Podcasts with High General Appeal Using a Pure Exploration Infinitely-Armed Bandit Strategy.