1. Audibert, J.-Y., Munos, R., Szepesvári, C.: Use of variance estimation in the multi-armed bandit problem. In: NIPS 2006 Workshop on On-line Trading of Exploration and Exploitation (2006)
2. Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2/3), 235–256 (2002)
3. Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: Gambling in a rigged casino: the adversarial multi-armed bandit problem. In: Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pp. 322–331. IEEE Computer Society, Los Alamitos (1995)
4. Azoulay-Schwartz, R., Kraus, S., Wilkenfeld, J.: Exploitation vs. exploration: choosing a supplier in an environment of incomplete information. Decis. Support Syst. 38(1), 1–18 (2004)
5. Bertsekas, D.P.: Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Upper Saddle River (1987)