Abstract
AbstractInteractive NLP is a promising paradigm to close the gap between automatic NLP systems and the human upper bound. Preference-based interactive learning has been successfully applied, but the existing methods require several thousand interaction rounds even in simulations with perfect user feedback. In this paper, we study preference-based interactive summarisation. To reduce the number of interaction rounds, we propose the Active Preference-based ReInforcement Learning (APRIL) framework. APRIL uses active learning to query the user, preference learning to learn a summary ranking function from the preferences, and neural Reinforcement learning to efficiently search for the (near-)optimal summary. Our results show that users can easily provide reliable preferences over summaries and that APRIL outperforms the state-of-the-art preference-based interactive method in both simulation and real-user experiments.
Funder
Deutsche Forschungsgemeinschaft
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences,Information Systems
Reference64 articles.
1. Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the people: The role of humans in interactive machine learning. AI Magazine, 35(4), 105–120.
2. Avinesh, P. V. S., & Meyer, C. M. (2017). Joint optimization of user-desired content in multi-document summaries by learning from user feedback. In Proceedings of the 55th annual meeting of the association for computational linguistics (ACL), July 30–August 4, 2017, Vancouver, Canada, Volume 1: Long papers (pp. 1353–1363).
3. Bertsekas, D. P., & Tsitsiklis, J. (1996). Neuro-dynamic programming. Belmont, MA: Athena Scientific.
4. Böhm, F., Gao, Y., Meyer, CM., Shapira, O., Dagan, I., & Gurevych, I. (2019). Better rewards yield better summaries: Learning to summarise without references. In Proceedings of the 2019 conference on empirical methods in natural language processing, Hong Kong, China, November 3–7, 2019.
5. Borisov, A., Wardenaar, M., Markov, I., & de Rijke, M. (2018). A click sequence model for web search. In The 41st international ACM SIGIR conference on research & development in information retrieval, Ann Arbor, MI, USA (pp. 45–54). https://doi.org/10.1145/3209978.3210004
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献