1. Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature.
2. Kober, J., et al. (2013). Reinforcement learning in robotics: A survey. Int. J. Robot. Res.
3. Mirowski, P., Pascanu, R., Viola, F., Soyer, H., Ballard, A., Banino, A., Denil, M., Goroshin, R., Sifre, L., and Kavukcuoglu, K. (2016). Learning to navigate in complex environments. arXiv.
4. Casper, S., Davies, X., Shi, C., Gilbert, T.K., Scheurer, J., Rando, J., Freedman, R., Korbak, T., Lindner, D., and Freire, P. (2023). Open problems and fundamental limitations of reinforcement learning from human feedback. arXiv.
5. Christiano, P.F., Leike, J., Brown, T., Martic, M., Legg, S., and Amodei, D. (2017, December 4–9). Deep reinforcement learning from human preferences. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.