1. Taming the monster: A fast and simple algorithm for contextual bandits;Agarwal,2014
2. Thompson sampling for contextual bandits with linear payoffs;Agrawal,2013
3. Offline contextual multi-armed bandits for mobile health interventions: A case study on emotion regulation;Ameko,2020
4. On the model-based stochastic value gradient for continuous reinforcement learning;Amos,2021
5. Partner selection for the emergence of cooperation in multi-agent systems using reinforcement learning;Anastassacos;Proceedings of the AAAI Conference on Artificial Intelligence,2020