1. Baxter, J., Bartlett, P.L.: Infinite-horizon gradient-based policy search. J. of Artificial Intelligence Res. 15, 319–350 (2001)
2. Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Athena Scientific (1997)
3. Chang, Y.-H., Ho, T., Kaelbling, L.P.: All learning is local: Multi-agent learning in global reward games. In: Advances in Neural Information Processing Systems, vol. 16 (2004)
4. Fernandez, F., Parker, L.E.: Learning in large cooperative multi-robot domains. Int. J. of Robotics and Automation 16(4), 217–226 (2001)
5. Guestrin, C., Koller, D., Parr, R.: Multiagent planning with factored MDPs. In: Advances in Neural Information Processing Systems, vol. 14 (2002)