1. Kareem Amin, Michael Kearns, Peter Key, and Anton Schwaighofer. 2012. Budget optimization for sponsored search: Censored learning in MDPs. arXiv preprint arXiv:1210.4847 (2012).
2. Learning dexterous in-hand manipulation
3. Varadaraja Ashwinkumar Badanidiyuru. 2022. Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards. Advances in Neural Information Processing Systems (2022).
4. Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, and Jon Schneider. 2019. Contextual bandits with cross-learning. Advances in Neural Information Processing Systems, Vol. 32 (2019).
5. Development of the PID controller