Value iteration in a class of average controlled Markov chains with unbounded costs: necessary and sufficient conditions for pointwise convergence

Published: 1996-12 Issue: 4 Volume: 33 Page: 986-1002
ISSN: 0021-9002
Container-title: Journal of Applied Probability
language: en

Author:

Cavazos-Cadena Rolando, Fernández-Gaucherand Emmanuel

Abstract

This work concerns controlled Markov chains with denumerable state space, (possibly) unbounded cost function, and an expected average cost criterion. Under a Lyapunov function condition, together with mild continuity-compactness assumptions, a simple necessary and sufficient criterion is given so that the relative value functions and differential costs produced by the value iteration scheme converge pointwise to the solution of the optimality equation; this criterion is applied to obtain convergence results when the cost function is bounded below or bounded above.
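To make the objects named in the abstract concrete, here is a minimal sketch of undiscounted value iteration on a small finite chain. It is an illustration only, not the authors' setting or code: the paper treats denumerable state spaces and possibly unbounded costs, whereas this toy has two states, two actions, bounded costs, and strictly positive transition probabilities. The function name `value_iteration`, the arrays `cost` and `P`, and the reference state are all hypothetical, and the definitions used (differential cost g_n(x) = V_{n+1}(x) - V_n(x), relative value h_n(x) = V_n(x) - V_n(z)) are the standard ones, assumed here rather than taken from the paper's text.

```python
def value_iteration(cost, P, ref_state=0, n_iter=200):
    """Undiscounted value iteration on a finite controlled Markov chain.

    cost[x][a] : one-step cost of action a in state x (illustrative data)
    P[x][a][y] : probability of moving from x to y under action a
    Returns (g, h): the last differential costs g_n(x) = V_{n+1}(x) - V_n(x)
    and the relative values h_n(x) = V_n(x) - V_n(ref_state).
    """
    n = len(cost)
    V = [0.0] * n          # V_0 = 0
    g = [0.0] * n
    for _ in range(n_iter):
        # V_{n+1}(x) = min_a [ c(x, a) + sum_y p(y | x, a) V_n(y) ]
        V_next = [
            min(
                cost[x][a] + sum(p * v for p, v in zip(P[x][a], V))
                for a in range(len(cost[x]))
            )
            for x in range(n)
        ]
        g = [V_next[x] - V[x] for x in range(n)]
        V = V_next
    h = [V[x] - V[ref_state] for x in range(n)]
    return g, h


# Hypothetical two-state, two-action example (all numbers are made up).
cost = [[1.0, 2.0], [3.0, 0.5]]
P = [
    [[0.9, 0.1], [0.2, 0.8]],   # transitions from state 0 under actions 0, 1
    [[0.5, 0.5], [0.1, 0.9]],   # transitions from state 1 under actions 0, 1
]
g, h = value_iteration(cost, P)
print(g, h)
```

For this toy chain the differential costs settle near 2/3 in both states and the relative values near (0, -5/3), which together satisfy the average cost optimality equation g + h(x) = min_a [c(x, a) + sum_y p(y | x, a) h(y)]. Whether such pointwise convergence holds in the denumerable, unbounded-cost case is exactly what the paper's criterion addresses; the sketch says nothing about that case.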

Publisher

Cambridge University Press (CUP)

Subject

Statistics, Probability and Uncertainty, General Mathematics, Statistics and Probability

References: 22 articles.

1. Discounted and undiscounted value-iteration in Markov decision problems: A survey

2. Equivalence of Lyapunov stability criteria in a class of Markov decision processes

3. Denumerable controlled Markov chains with average reward criterion: Sample path optimality

4. Cavazos-Cadena R. (1995) Undiscounted value iteration in stable Markov decision processes with bounded rewards. J. Math. Systems, Estimation and Control. To appear.

Cited by 10 articles.

1. Sample-Path Optimality in Average Markov Decision Chains Under a Double Lyapunov Function Condition;Optimization, Control, and Applications of Stochastic Systems;2012

2. Denumerable-state continuous-time Markov decision processes with unbounded transition and reward rates under the discounted criterion;Journal of Applied Probability;2002-06

3. Stability, Performance Evaluation, and Optimization;International Series in Operations Research & Management Science;2002

4. Optimality Conditions for CTMDP with Average Cost Criterion;Markov Processes and Controlled Markov Chains;2002

5. Adaptive control of average Markov decision chains under the Lyapunov stability condition;Mathematical Methods of Operations Research (ZOR);2001-10-01