Abstract
Partially observable Markov decision processes (POMDPs) are a standard model for agents making decisions in uncertain environments. Most work on POMDPs focuses on synthesizing strategies based on the available capabilities. However, system designers can often control an agent's observation capabilities, e.g., by placing or selecting sensors. This raises the question of how one should select an agent's sensors cost-effectively such that it achieves the desired goals. In this paper, we study the novel optimal observability problem (oop): Given a POMDP $\mathscr{M}$, how should one change $\mathscr{M}$'s observation capabilities within a fixed budget such that its (minimal) expected reward remains below a given threshold? We show that the problem is undecidable in general and decidable when considering positional strategies only. We present two algorithms for a decidable fragment of the oop: one based on optimal strategies of $\mathscr{M}$'s underlying Markov decision process and one based on parameter synthesis with SMT. We report promising results for variants of typical examples from the POMDP literature.
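For the decidable fragment, the oop restricted to positional strategies can be illustrated by brute force on a toy model. The sketch below is a hypothetical example, not taken from the paper: a five-state corridor POMDP with the goal in the middle, where "changing observation capabilities within a budget" means choosing at most `budget` states that receive a unique observation (all others look identical), and a positional strategy maps each observation to one action.

```python
from itertools import combinations, product

# Hypothetical toy POMDP (not from the paper): a 5-state corridor with the
# goal in the middle. Action "R"/"L" moves the state by +1/-1 (clamped).
STATES, GOAL, ACTIONS = range(5), 2, "LR"

def step(s, a):
    return max(0, min(4, s + (1 if a == "R" else -1)))

def expected_cost(sensors, strategy, horizon=50):
    """Average steps-to-goal over uniform initial states; inf if some
    start never reaches the goal under the positional strategy."""
    total = 0
    for s in STATES:
        steps = 0
        while s != GOAL:
            if steps >= horizon:          # trajectory loops: cost is infinite
                return float("inf")
            obs = s if s in sensors else "blank"
            s = step(s, strategy[obs])
            steps += 1
        total += steps
    return total / len(STATES)

def best_within_budget(budget):
    """Brute-force the oop instance: pick <= budget uniquely observable
    states and a positional strategy minimising the expected cost."""
    best = float("inf")
    for k in range(budget + 1):
        for sensors in combinations(STATES, k):
            obs_space = list(sensors) + ["blank"]
            for acts in product(ACTIONS, repeat=len(obs_space)):
                strategy = dict(zip(obs_space, acts))
                best = min(best, expected_cost(set(sensors), strategy))
    return best
```

In this toy instance, no sensor placement of size 0 or 1 reaches the goal from every start (the blanket action walks away from the goal on one side), whereas a budget of 2 suffices, so any finite threshold is only met at budget 2. The paper's algorithms replace this enumeration with MDP-based strategy analysis and SMT-based parameter synthesis.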
Publisher: Springer Nature Switzerland