Abstract
A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a given convex family of distributions depending on the present state. The immediate cost is prescribed for each choice and it is required to minimise the average expected cost over an infinite future. The paper considers a special case of this general problem and provides the foundation for a general solution. The main result is that an optimal policy exists if each state of the system can be reached with positive probability from any other state by choosing a suitable policy.
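The average-cost criterion described above can be illustrated with a small numerical sketch. The example below is not the paper's construction: it assumes a finite set of actions per state (a simplification of the convex family of distributions), uses relative value iteration as a stand-in solution method, and all names (`relative_value_iteration`, `P`, `c`) and the two-state data are hypothetical. Every transition probability in the toy data is positive, so each state is reachable from every other under any policy.

```python
def relative_value_iteration(P, c, tol=1e-10, max_iter=10_000):
    """Approximate the minimal average cost of a finite controlled Markov chain.

    P[a][i][j] : probability of moving from state i to state j under action a
    c[i][a]    : immediate cost of choosing action a in state i
    Returns (gain, h, policy), where gain is the average cost per step,
    h is the relative value vector (normalised so h[0] == 0), and
    policy[i] is the minimising action in state i.
    """
    n_states = len(c)
    n_actions = len(P)
    h = [0.0] * n_states
    gain = 0.0
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # One-step lookahead: q[i][a] = c[i][a] + sum_j P[a][i][j] * h[j]
        q = [
            [c[i][a] + sum(P[a][i][j] * h[j] for j in range(n_states))
             for a in range(n_actions)]
            for i in range(n_states)
        ]
        best = [min(row) for row in q]
        gain = best[0]                      # state 0 is the reference state
        h_new = [b - gain for b in best]    # subtract the gain to keep h bounded
        diff = max(abs(x - y) for x, y in zip(h_new, h))
        h = h_new
        if diff < tol:
            break
    policy = [min(range(n_actions), key=lambda a: q[i][a]) for i in range(n_states)]
    return gain, h, policy


# Hypothetical two-state, two-action example (illustrative data only).
P = [
    [[0.9, 0.1], [0.5, 0.5]],   # action 0
    [[0.2, 0.8], [0.1, 0.9]],   # action 1
]
c = [
    [1.0, 2.0],                 # costs in state 0 for actions 0 and 1
    [3.0, 0.5],                 # costs in state 1 for actions 0 and 1
]

gain, h, policy = relative_value_iteration(P, c)
```

In this toy instance the policy that always chooses action 1 has stationary distribution (1/9, 8/9) and average cost 2/9 + 4/9 = 2/3, which is the smallest among the four stationary policies, so the iteration should settle near that value.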
Publisher
Cambridge University Press (CUP)
Subject
Applied Mathematics, Statistics and Probability
Cited by
53 articles.