Abstract
This paper studies constrained Markov decision processes under the total expected discounted cost optimality criterion, with a state-action dependent discount factor that may take any value between zero and one. Both the state space and the action space are assumed to be Borel spaces. Using the linear programming approach, which consists of formulating the control problem as a linear program over a set of occupation measures, we show the existence of an optimal stationary Markov policy. Our results are based on the study of weak-strong topologies on the space of occupation measures and of Young measures on the space of Markov policies.
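The LP approach named in the abstract can be illustrated in a finite state-action setting (an assumption for this sketch; the paper works with general Borel spaces): the control problem becomes an ordinary linear program over occupation measures x(s, a), with a state-action dependent discount factor entering the balance equations. All data below (P, c, d, gamma, bound) are hypothetical placeholders, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical finite model: nS states, nA actions.
nS, nA = 3, 2
rng = np.random.default_rng(0)

P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a, s']: transition law
c = rng.uniform(0.0, 1.0, size=(nS, nA))        # cost to be minimized
d = rng.uniform(0.0, 1.0, size=(nS, nA))        # cost appearing in the constraint
gamma = rng.uniform(0.5, 0.9, size=(nS, nA))    # state-action dependent discount in (0, 1)
mu0 = np.full(nS, 1.0 / nS)                     # initial state distribution
bound = 10.0                                    # constraint level, chosen large enough to be feasible

# Balance (flow) equations defining occupation measures: for every s',
#   sum_a x(s', a) - sum_{s, a} gamma(s, a) P(s' | s, a) x(s, a) = mu0(s').
A_eq = np.zeros((nS, nS * nA))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = float(sp == s) - gamma[s, a] * P[s, a, sp]

# Minimize total discounted cost subject to the discounted-cost constraint
#   sum_{s, a} d(s, a) x(s, a) <= bound,  x >= 0.
res = linprog(c.ravel(),
              A_ub=d.ravel()[None, :], b_ub=[bound],
              A_eq=A_eq, b_eq=mu0, bounds=(0, None))

x = res.x.reshape(nS, nA)
# A stationary Markov policy is recovered by normalizing the occupation measure.
policy = x / x.sum(axis=1, keepdims=True)
```

The normalization step mirrors the correspondence the paper exploits: each occupation measure disintegrates into an initial-distribution part and a stationary randomized policy.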
Funder
Ministerio de Ciencia e Innovación
Consejo Nacional de Ciencia y Tecnología
Publisher
Springer Science and Business Media LLC