1. Anselmi, J., Gaujal, B., Rebuffi, L.-S.: Optimal speed profile of a DVFS processor under soft deadlines. In: Performance Evaluation. Elsevier (2021). (to appear)
2. Asanjarani, A., Nazarathy, Y., Taylor, P.: A survey of parameter and state estimation in queues. Queueing Syst. 97, 39–80 (2021)
3. Azar, M.G., Osband, I., Munos, R.: Minimax regret bounds for reinforcement learning. In: International Conference on Machine Learning, pp. 263–272. PMLR (2017)
4. Gast, N., Gaujal, B., Khun, K.: Reinforcement learning for Markovian bandits: is posterior sampling more scalable than optimism? Technical Report hal-03262006, HAL-Inria (June 2021)
5. Jaksch, T., Ortner, R., Auer, P.: Near-optimal regret bounds for reinforcement learning. J. Mach. Learn. Res. 11(4), 1563–1600 (2010)