1. Yasin Abbasi-Yadkori, David Pal, and Csaba Szepesvari. 2012. Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits. In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, Neil D. Lawrence and Mark Girolami (Eds.) (Proceedings of Machine Learning Research, Vol. 22). PMLR, La Palma, Canary Islands. 1–9.
2. Mohammad Gheshlaghi Azar, Ian Osband, and Rémi Munos. 2017. Minimax Regret Bounds for Reinforcement Learning. In Proceedings of the 34th International Conference on Machine Learning - Volume 70 (ICML’17). JMLR.org, 263–272.
3. James Bagnell, Sham M Kakade, Jeff Schneider, and Andrew Ng. 2003. Policy search by dynamic programming. Advances in Neural Information Processing Systems 16 (2003).
4. Alexandre Belloni, Victor Chernozhukov, and Lie Wang. 2011. Square-root lasso: pivotal recovery of sparse signals via conic programming. Biometrika 98, 4 (2011), 791–806.
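5. Peter J. Bickel, Ya'acov Ritov, and Alexandre B. Tsybakov. 2009. Simultaneous analysis of Lasso and Dantzig selector. The Annals of Statistics 37, 4 (2009), 1705–1732.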