1. Gradient descent for general reinforcement learning;Baird,1999
2. Reinforcement learning in POMDP's via direct gradient ascent;Baxter,2000
3. Dynamic Programming;Bellman,1957
4. On-line learning and the metrical task system problem;Blum,1997
5. Convergence problems of general-sum multiagent reinforcement learning;Bowling,2000