MDPs as Distribution Transformers: Affine Invariant Synthesis for Safety Objectives

Published:2023 Page:86-112
ISSN:0302-9743
Container-title:Computer Aided Verification

Author:

Akshay S., Chatterjee Krishnendu, Meggendorfer Tobias, Žikelić Đorđe

Abstract

Markov decision processes can be viewed as transformers of probability distributions. While this view is useful from a practical standpoint to reason about trajectories of distributions, basic reachability and safety problems are known to be computationally intractable (i.e., Skolem-hard) to solve in such models. Further, we show that even for simple examples of MDPs, strategies for safety objectives over distributions can require infinite memory and randomization. In light of this, we present a novel overapproximation approach to synthesize strategies in an MDP, such that a safety objective over the distributions is met. More precisely, we develop a new framework for template-based synthesis of certificates as affine distributional and inductive invariants for safety objectives in MDPs. We provide two algorithms within this framework. One can only synthesize memoryless strategies, but has relative completeness guarantees, while the other can synthesize general strategies. The runtime complexity of both algorithms is in PSPACE. We implement these algorithms and show that they can solve several non-trivial examples.

Publisher

Springer Nature Switzerland

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-37709-9_5

Cited by 1 article.

1. Skolem and positivity completeness of ergodic Markov chains. Information Processing Letters (2024-08)