1. The complexity of decentralized control of Markov decision processes;Bernstein;Mathematics of Operations Research,2002
2. Predictive state temporal difference learning;Boots,2010
3. Boots, B., & Gordon, G. (2011). An online spectral learning algorithm for partially observable dynamical systems. In Proceedings of the 25th AAAI conference on artificial intelligence (pp. 293–300).
4. Closing the learning-planning loop with predictive state representations;Boots;International Journal of Robotics Research,2010
5. Learning predictive state representations using non-blind policies;Bowling,2006