Publisher: Springer Nature Singapore
References (46 articles)
1. Abounadi, J., Bertsekas, D. P., & Borkar, V. S. (2001). Learning algorithms for Markov decision processes with average cost. SIAM Journal on Control and Optimization, 40(3), 681–698.
2. Bardi, M., & Capuzzo-Dolcetta, I. (1997). Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Boston: Birkhäuser.
3. Barto, A., Sutton, R., & Anderson, C. (1983). Neuron-like elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man and Cybernetics, 13, 835–846.
4. Benaim, M., & Hirsch, M. W. (1996). Stochastic adaptive behavior for prisoner’s dilemma. Unpublished manuscript.
5. Benaim, M. (1997). Vertex-reinforced random walks and a conjecture of Pemantle. Annals of Probability, 25(1), 361–392.