1. Auer, P., Jaksch, T., Ortner, R.: Near-optimal regret bounds for reinforcement learning. J. Mach. Learn. Res. 11, 1563–1600 (2010)
2. Azar, M., Munos, R., Kappen, B.: On the sample complexity of reinforcement learning with a generative model. In: Proceedings of the 29th International Conference on Machine Learning. ACM, New York (2012)
3. Auer, P., Ortner, R.: Logarithmic online regret bounds for undiscounted reinforcement learning. In: Advances in Neural Information Processing Systems 19, pp. 49–56. MIT Press (2007)
4. Auer, P.: Upper confidence reinforcement learning. Unpublished, keynote at European Workshop on Reinforcement Learning (2011)
5. Chung, F., Lu, L.: Concentration inequalities and martingale inequalities: a survey. Internet Mathematics 3(1) (2006)