Reducing reinforcement learning to KWIK online regression

Published:2010-04 Issue:3-4 Volume:58 Page:217-237
ISSN:1012-2443
Container-title:Annals of Mathematics and Artificial Intelligence
language:en
Short-container-title:Ann Math Artif Intell

Author:

Li Lihong, Littman Michael L.

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Artificial Intelligence

Link

http://link.springer.com/content/pdf/10.1007/s10472-010-9201-2.pdf

References: 41 articles.

1. Asmuth, J., Li, L., Littman, M. L., Nouri, A., Wingate, D.: A Bayesian sampling approach to exploration in reinforcement learning. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI-09), pp. 19–26 (2009)

2. Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3, 397–422 (2002)

3. Bagnell, J.A., Kakade, S., Ng, A.Y., Schneider, J.: Policy search by dynamic programming. Adv. Neural Inf. Process. Syst. 16 (NIPS-03), 831–838 (2004)

4. Boyan, J.A., Moore, A.W.: Generalization in reinforcement learning: safely approximating the value function. Adv. Neural Inf. Process. Syst. 7, 369–376 (1995)

5. Brafman, R.I., Tennenholtz, M.: R-max—a general polynomial time algorithm for near-optimal reinforcement learning. J. Mach. Learn. Res. 3, 213–231 (2002)

Cited by 5 articles.

1. Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm;European Journal of Operational Research;2015-02

2. Agnostic Pointwise-Competitive Selective Classification;Journal of Artificial Intelligence Research;2015-01-26

3. Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains;Artificial Intelligence;2014-11

4. Sample Complexity Bounds of Exploration;Adaptation, Learning, and Optimization;2012

5. Knows what it knows: a framework for self-aware learning;Machine Learning;2010-11-25