1. Christopher Amato, Jilles Steeve Dibangoye, Shlomo Zilberstein, Incremental policy generation for finite-horizon Dec-POMDPs, in: Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS), Thessaloniki, Greece, September 19–23, AAAI, 2009.
2. Aras, An investigation into mathematical programming for finite horizon decentralized POMDPs, J. Artif. Intell. Res., 2010.
3. Auer, Using confidence bounds for exploitation-exploration trade-offs, J. Mach. Learn. Res., 2002.
4. Bikramjit Banerjee, Jeremy Lyle, Landon Kraemer, Rajesh Yellamraju, Sample bounded distributed reinforcement learning for decentralized POMDPs, in: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-12), Toronto, Canada, July 2012, pp. 1256–1262.
5. Bikramjit Banerjee, Peter Stone, General game learning using knowledge transfer, in: Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI-07), Hyderabad, India, 2007, pp. 672–677.