1. Audibert, J.-Y., Bubeck, S.: Minimax policies for adversarial and stochastic bandits. In: Proceedings of the Annual Conference on Learning Theory (COLT) (2009)
2. Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: Gambling in a rigged casino: the adversarial multi-armed bandit problem. In: Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pp. 322–331. IEEE Computer Society Press, Los Alamitos (1995)
3. Bouzy, B., Métivier, M.: Multi-agent learning experiments on repeated matrix games. In: ICML, pp. 119–126 (2010)
4. Coulom, R.: Lecture Notes in Computer Science (2007)
5. Grigoriadis, M.D., Khachiyan, L.G.: A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters 18(2), 53–58 (1995)