1. A new look at dynamic regret for non-stationary stochastic bandits;Abbasi-Yadkori;Journal of Machine Learning Research,2023
2. State abstraction as compression in apprenticeship learning;Abel,2019
3. State abstractions for lifelong reinforcement learning;Abel,2018
4. FLAMBE: Structural complexity and representation learning of low rank MDPs;Agarwal;Advances in Neural Information Processing Systems,2020
5. Analysis of Thompson sampling for the multi-armed bandit problem;Agrawal,2012