1. Abernethy, J., Hazan, E., Rakhlin, A.: Competing in the dark: an efficient algorithm for bandit linear optimization. In: Proceedings of the 21st Conference on Learning Theory, pp. 263–274 (2008)
2. Agarwal, A., Dekel, O., Xiao, L.: Optimal algorithms for online convex optimization with multi-point bandit feedback. In: Proceedings of the 23rd Conference on Learning Theory, pp. 28–40 (2010)
3. Awerbuch, B., Kleinberg, R.: Online linear optimization and adaptive routing. J. Comput. Syst. Sci. 74(1), 97–114 (2008)
4. Bubeck, S., Cesa-Bianchi, N., Kakade, S.M.: Towards minimax policies for online linear optimization with bandit feedback. In: Proceedings of the 25th Conference on Learning Theory, pp. 41.1–41.14 (2012)
5. Bubeck, S., Dekel, O., Koren, T., Peres, Y.: Bandit convex optimization: $$\sqrt{T}$$ regret in one dimension. In: Proceedings of the 28th Conference on Learning Theory, pp. 266–278 (2015)