Abstract
We consider a non-stationary Bayesian dynamic decision model with general state, action and parameter spaces. It is shown that this model can be reduced to a non-Markovian (resp. Markovian) decision model with completely known transition probabilities. Under rather weak convergence assumptions on the expected total rewards, some general results are presented concerning the restriction to deterministic generalized Markov policies, the criteria of optimality and the existence of Bayes policies. These facts are based on the above transformations and on results of Hinderer and Schäl.
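The reduction to a model with known transition probabilities can be illustrated in a much simpler setting than the one treated in the paper. The sketch below assumes a finite parameter set and finite state and action sets, which is an assumption made here for illustration only; the names `kernels`, `prior`, `predictive` and `posterior_update` are hypothetical. The idea is that once the current posterior over the unknown parameter is carried along with the state, the law of the next state is a known mixture, and the posterior itself is updated by Bayes' rule.

```python
from fractions import Fraction as F

# Hypothetical toy data: two candidate parameters "A" and "B", each giving a
# transition law for the pair (state 0, action "go"). Exact fractions are used
# so the arithmetic can be checked by hand.
kernels = {
    "A": {(0, "go"): {0: F(1, 4), 1: F(3, 4)}},
    "B": {(0, "go"): {0: F(3, 4), 1: F(1, 4)}},
}
prior = {"A": F(1, 2), "B": F(1, 2)}


def predictive(belief, s, a, kernels):
    """Transition law of the reduced model: mix the kernels by the belief."""
    next_states = set()
    for th in belief:
        next_states.update(kernels[th][(s, a)])
    return {
        s2: sum(belief[th] * kernels[th][(s, a)].get(s2, 0) for th in belief)
        for s2 in next_states
    }


def posterior_update(belief, s, a, s_next, kernels):
    """Bayes' rule: belief over the parameter after observing s -> s_next."""
    weights = {
        th: belief[th] * kernels[th][(s, a)].get(s_next, 0) for th in belief
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("observed transition has zero predictive probability")
    return {th: w / total for th, w in weights.items()}


law = predictive(prior, 0, "go", kernels)
posterior = posterior_update(prior, 0, "go", 1, kernels)
```

In this toy case the pair (state, posterior) plays the role of the state of the reduced model: `law` depends only on quantities the decision maker knows, even though the parameter itself is unknown.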
Publisher
Cambridge University Press (CUP)
Subject
Applied Mathematics, Statistics and Probability
References
21 articles.
1. Positive dynamic programming; Blackwell; Proc. Fifth Berkeley Symp. Math. Statist. Prob., 1967
2. Negative Dynamic Programming
3. Fundamental theorems in a Bayes controlled process; Furukawa; Bull. Math. Statist., 1970
Cited by
67 articles.