Safe Policy Improvement in Constrained Markov Decision Processes

Published:2022 Page:360-381
ISSN:0302-9743
Container-title:Leveraging Applications of Formal Methods, Verification and Validation. Verification Principles

Author:

Berducci Luigi, Grosu Radu

Publisher

Springer International Publishing

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-19849-6_21

References: 65 articles.

1. Abels, A., Roijers, D., Lenaerts, T., Nowé, A., Steckelmacher, D.: Dynamic weights in multi-objective deep reinforcement learning. In: International Conference on Machine Learning, pp. 11–20. PMLR (2019)

2. Achiam, J., Held, D., Tamar, A., Abbeel, P.: Constrained policy optimization. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6–11 August 2017. Proceedings of Machine Learning Research, vol. 70, pp. 22–31. PMLR (2017). http://proceedings.mlr.press/v70/achiam17a.html

3. Agha, G., Palmskog, K.: A survey of statistical model checking. ACM Trans. Model. Comput. Simul. (TOMACS) 28(1), 1–39 (2018)

4. Alshiekh, M., Bloem, R., Ehlers, R., Könighofer, B., Niekum, S., Topcu, U.: Safe reinforcement learning via shielding. CoRR arXiv:1708.08611 (2017)

5. Altman, E.: Constrained Markov decision processes with total cost criteria: Lagrangian approach and dual linear program. Math. Methods Oper. Res. 48(3), 387–417 (1998)

Cited by 1 article.

1. X-by-Construction Meets Runtime Verification; Leveraging Applications of Formal Methods, Verification and Validation. Verification Principles; 2022