The Expected Total Cost Criterion for Markov Decision Processes under Constraints: A Convex Analytic Approach

Published: 2012-09 Issue: 3 Volume: 44 Page: 774-793
ISSN: 0001-8678
Container-title: Advances in Applied Probability
Language: en
Short-container-title: Advances in Applied Probability

Author:

Dufour François, Horiguchi M., Piunovskiy A. B.

Abstract

This paper deals with discrete-time Markov decision processes (MDPs) under constraints where all the objectives have the same form of expected total cost over the infinite time horizon. The existence of an optimal control policy is discussed by using the convex analytic approach. We work under the assumptions that the state and action spaces are general Borel spaces, and that the model is nonnegative, semicontinuous, and there exists an admissible solution with finite cost for the associated linear program. It is worth noting that, in contrast to the classical results in the literature, our hypotheses do not require the MDP to be transient or absorbing. Our first result ensures the existence of an optimal solution to the linear program given by an occupation measure of the process generated by a randomized stationary policy. Moreover, it is shown that this randomized stationary policy provides an optimal solution to this Markov control problem. As a consequence, these results imply that the set of randomized stationary policies is a sufficient set for this optimal control problem. Finally, our last main result states that all optimal solutions of the linear program coincide on a special set with an optimal occupation measure generated by a randomized stationary policy. Several examples are presented to illustrate some theoretical issues and the possible applications of the results developed in the paper.

Publisher

Cambridge University Press (CUP)

Subject

Applied Mathematics, Statistics and Probability

References: 16 articles.

1. On dynamic programming: Compactness of the space of policies

2. Convex Analysis

3. Markov Decision Processes

4. Optimal Control of Random Sequences in Problems with Constraints

Cited by 12 articles.

1. Maximizing the probability of visiting a set infinitely often for a Markov decision process with Borel state and action spaces;Journal of Applied Probability;2024-08-22

2. On the Continuity of the Projection Mapping from Strategic Measures to Occupation Measures in Absorbing Markov Decision Processes;Applied Mathematics & Optimization;2024-04-12

3. Extreme Occupation Measures in Markov Decision Processes with an Absorbing State;SIAM Journal on Control and Optimization;2024-01-12

4. On the structure of optimal solutions in a mathematical programming problem in a convex space;Operations Research Letters;2023-09

5. Duality in optimal impulse control;Journal of Mathematical Analysis and Applications;2022-05