1. Jacob Duncan Abernethy, Elad Hazan, and Alexander Rakhlin. 2008. An efficient algorithm for bandit linear optimization. In Conference on Learning Theory.
2. Amir Beck and Marc Teboulle. 2003. Mirror descent and nonlinear projected subgradient methods for convex optimization. Operations Research Letters 31, 3 (2003), 167–175.
3. Noam Brown, Adam Lerer, Sam Gross, and Tuomas Sandholm. 2019. Deep counterfactual regret minimization. In International Conference on Machine Learning. PMLR, 793–802.
4. Noam Brown and Tuomas Sandholm. 2016. Strategy-based warm starting for regret minimization in games. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 30.
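5. Neil Burch, Matej Moravcik, and Martin Schmid. 2019. Revisiting CFR+ and alternating updates. Journal of Artificial Intelligence Research 64 (2019).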