1. Achiam, J., Held, D., Tamar, A., & Abbeel, P. (2017). Constrained policy optimization. In D. Precup & Y. W. Teh (Eds.), 34th international conference on machine learning. Proceedings of machine learning research (Vol. 70, pp. 22–31). MLR Press.
2. Alshiekh, M., Bloem, R., Ehlers, R., et al. (2018). Safe reinforcement learning via shielding. In 32nd AAAI conference on artificial intelligence (Vol. 32, No. 1, pp. 2669–2678). https://doi.org/10.1609/aaai.v32i1.11797
3. Altman, E. (1999). Constrained Markov decision processes. Boca Raton: CRC Press.
4. Boutilier, C., & Lu, T. (2016). Budget allocation using weakly coupled, constrained Markov decision processes. In Proceedings of the 32nd conference on uncertainty in artificial intelligence (pp. 52–61).
5. Carrara, N., Leurent, E., Laroche, R., et al. (2019). Budgeted reinforcement learning in continuous state space. In Advances in neural information processing systems (Vol. 32). Curran Associates, Inc.