1. Abbeel, P., & Ng, A. Y. (2004). Apprenticeship learning via inverse reinforcement learning. In Proceedings of the 21st International Conference on Machine Learning (ICML’04). Banff, AB, Canada.
2. Boularias, A., Chinaei, H. R., & Chaib-draa, B. (2010). Learning the reward model of dialogue POMDPs from data. In NIPS 2010 Workshop on Machine Learning for Assistive Technologies. Vancouver, BC, Canada.
3. Boularias, A., Kober, J., & Peters, J. (2011). Relative entropy inverse reinforcement learning. Journal of Machine Learning Research—Proceedings Track, 15, 182–189.
4. Chandramohan, S., Geist, M., Lefèvre, F., & Pietquin, O. (2012). Behavior specific user simulation in spoken dialogue systems. In Proceedings of the IEEE ITG Conference on Speech Communication. Braunschweig, Germany.
5. Chinaei, H. R., & Chaib-draa, B. (2011). Learning dialogue POMDP models from data. In Proceedings of the 24th Canadian Conference on Advances in Artificial Intelligence (Canadian AI’11). St. John’s, NL, Canada.