Abstract
We consider a finite controlled Markov chain, the description of which depends on an unknown parameter a, and investigate the following control policy. To each a an optimal stationary control is associated. a is estimated recurrently from the trajectory by the minimum contrast method, and the optimal stationary control corresponding to the estimate is used. We present asymptotic properties of the estimate and of the criterion function. They follow from the law of large numbers and from the central limit theorem for controlled Markov chains derived with the aid of martingales.
Publisher
Cambridge University Press (CUP)
Subject
Applied Mathematics,Statistics and Probability
Cited by
131 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献