Author:
Hafez Muhammad Burhan, Loo Chu Kiong
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Software
Reference
30 articles.
1. Sutton RS (1991) Dyna, an integrated architecture for learning, planning, and reacting. ACM SIGART Bull 2(4):160–163
2. Moore AW, Atkeson CG (1993) Prioritized sweeping: reinforcement learning with less data and less time. Mach Learn 13(1):103–130
3. Hwang KS, Jiang WC, Chen YJ (2012) Tree-based Dyna-Q agent. The 2012 IEEE/ASME international conference on advanced intelligent mechatronics, Kaohsiung, Taiwan
4. Sutton RS (1988) Learning to predict by the methods of temporal differences. Mach Learn 3(1):9–44
5. Watkins CGCH (1989) Learning from delayed rewards. King’s College, Cambridge
Cited by
12 articles.