1. Pattern recognition and machine learning;Bishop,2006
2. R-max—A general polynomial time algorithm for near-optimal reinforcement learning;Brafman;Journal of Machine Learning Research,2002
3. Active learning with statistical models;Cohn;Journal of Artificial Intelligence Research,1996
4. Theory of optimal experiments;Fedorov,1972
5. Adaptive importance sampling for value function approximation in off-policy reinforcement learning;Hachiya;Neural Networks,2009