1. Auer, P., Cesa-Bianchi, N., Freund, Y., & Schapire, R. E. (1995). Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS) (pp. 322–331).
2. Aumann, R. (1974). Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics, 1, 67–96.
3. Banerjee, B., & Peng, J. (2004). Performance bounded reinforcement learning in strategic interactions. In Proceedings of the National Conference on Artificial Intelligence (AAAI) (pp. 2–7). San Jose, CA, USA.
4. Banerjee, B., Sen, S., & Peng, J. (2001). Fast concurrent reinforcement learners. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI) (pp. 825–830). Seattle, WA, USA.
5. Bowling, M. (2005). Convergence and no-regret in multiagent learning. In Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS) (pp. 209–216). Vancouver, Canada.