1. Q-learning; Watkins; Machine Learning, 1992
2. Delayed reinforcement, fuzzy Q-learning and fuzzy logic controllers; Bonarini, 1996
3. Learning to act using real-time dynamic programming; Barto; Artificial Intelligence, 1995
4. Reinforcement learning is direct adaptive optimal control; Sutton, 1991
5. Reinforcement Learning: An Introduction; Sutton, 1998