1. Agrawal, S., Goyal, N. (2012) Analysis of Thompson sampling for the multi-armed bandit problem. In: Mannor, S., Srebro, N., Williamson, R.C. (eds.) Proceedings of the 25th Annual Conference on Learning Theory. Proceedings of Machine Learning Research, vol. 23, pp. 39.1–39.26. PMLR, Edinburgh, Scotland.
2. Åkerblom, N., Chen, Y., Haghir Chehreghani, M. (2020) An online learning framework for energy-efficient navigation of electric vehicles. In: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2051–2057. https://doi.org/10.24963/ijcai.2020/284
3. Auer, P. (2003) Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research 3, 397–422.
4. Batagelj, V., Mrvar, A. (2006) Pajek datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/. Accessed: 2021-09-08
5. Beebe, N.H.F. (2002) Nelson H. F. Beebe’s Bibliographies Page. http://www.math.utah.edu/~beebe/bibliographies.html. Accessed: 2021-09-08