1. Achiam, J., Held, D., Tamar, A., Abbeel, P.: Constrained policy optimization. In: International Conference on Machine Learning (2017)
2. Alshiekh, M., Bloem, R., Ehlers, R., Könighofer, B., Niekum, S., Topcu, U.: Safe reinforcement learning via shielding. In: AAAI Conference on Artificial Intelligence (2018)
3. Altman, E.: Constrained Markov Decision Processes. CRC Press, Boca Raton (1999)
4. Barth-Maron, G., et al.: Distributed distributional deterministic policy gradients. In: International Conference on Learning Representations (2018)
5. Bellemare, M.G., Dabney, W., Munos, R.: A distributional perspective on reinforcement learning. In: International Conference on Machine Learning, pp. 449–458 (2017)