Abstract
This paper studies constrained Markov decision processes under the total expected discounted cost optimality criterion, with a state-action dependent discount factor that may take any value between zero and one. Both the state space and the action space are assumed to be Borel spaces. Using the linear programming approach, which consists of formulating the control problem as a linear program over a set of occupation measures, we show the existence of an optimal stationary Markov policy. Our results are based on the study of weak-strong topologies on the space of occupation measures and of Young measures on the space of Markov policies.
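The LP approach named in the abstract can be illustrated in a finite state-action setting (an assumption for this sketch; the paper works with general Borel spaces): the control problem becomes an ordinary linear program over occupation measures x(s, a), with a state-action dependent discount factor entering the balance equations. All data below (P, c, d, gamma, bound) are hypothetical placeholders, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical finite model: nS states, nA actions.
nS, nA = 3, 2
rng = np.random.default_rng(0)

P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a, s']: transition law
c = rng.uniform(0.0, 1.0, size=(nS, nA))        # cost to be minimized
d = rng.uniform(0.0, 1.0, size=(nS, nA))        # cost appearing in the constraint
gamma = rng.uniform(0.5, 0.9, size=(nS, nA))    # state-action dependent discount in (0, 1)
mu0 = np.full(nS, 1.0 / nS)                     # initial state distribution
bound = 10.0                                    # constraint level, chosen large enough to be feasible

# Balance (flow) equations defining occupation measures: for every s',
#   sum_a x(s', a) - sum_{s, a} gamma(s, a) P(s' | s, a) x(s, a) = mu0(s').
A_eq = np.zeros((nS, nS * nA))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = float(sp == s) - gamma[s, a] * P[s, a, sp]

# Minimize total discounted cost subject to the discounted-cost constraint
#   sum_{s, a} d(s, a) x(s, a) <= bound,  x >= 0.
res = linprog(c.ravel(),
              A_ub=d.ravel()[None, :], b_ub=[bound],
              A_eq=A_eq, b_eq=mu0, bounds=(0, None))

x = res.x.reshape(nS, nA)
# A stationary Markov policy is recovered by normalizing the occupation measure.
policy = x / x.sum(axis=1, keepdims=True)
```

The normalization step mirrors the correspondence the paper exploits: each occupation measure disintegrates into an initial-distribution part and a stationary randomized policy.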
Funder
Ministerio de Ciencia e Innovación
Consejo Nacional de Ciencia y Tecnología
Publisher
Springer Science and Business Media LLC