1. Wilson A, Fern A, Tadepalli P (2014) Using trajectory data to improve Bayesian optimization for reinforcement learning. J Mach Learn Res 15(8):253–282
2. Abbeel P, Ng AY (2004) Apprenticeship learning via inverse reinforcement learning. In Proceedings of the twenty-first international conference on machine learning, pp 1–8. Association for Computing Machinery
3. Coates A, Abbeel P, Ng AY (2009) Apprenticeship learning for helicopter control. Commun ACM 52(7):97–105
4. Agogino AK, Tumer K (2004) Unifying temporal and structural credit assignment problems. In Proceedings of the third international joint conference on autonomous agents and multiagent systems–vol 2, pp 980–987. IEEE Computer Society
5. Al WA, Yun ID (2019) Partial policy-based reinforcement learning for anatomical landmark localization in 3D medical images. IEEE Trans Med Imaging