1. Abbeel, P., & Ng, A. Y. (2005). Exploration and apprenticeship learning in reinforcement learning. In Proceedings of the Twenty-Second International Conference on Machine Learning (pp. 1-8).
2. Aberdeen, D., & Baxter, J. (2002). Scalable Internal-State Policy-Gradient Methods for POMDPs. In Proceedings of the Nineteenth International Conference on Machine Learning (pp. 3-10).
3. Baird, L. (1999). Gradient Descent for General Reinforcement Learning. In Advances in Neural Information Processing Systems.
4. Boutilier, C., & Poole, D. (1996). Computing Optimal Policies for Partially Observable Decision Processes using Compact Representations. In Proceedings of the Thirteenth National Conference on Artificial Intelligence (pp. 1168-1175).
5. Chrisman, L. (1992). Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach. In Proceedings of the Tenth National Conference on Artificial Intelligence (pp. 183-188).