Document Summarization with Latent Queries

Authors:

Yumo Xu1, Mirella Lapata2

Affiliations:

1. Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom. yumo.xu@ed.ac.uk

2. Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom. mlap@inf.ed.ac.uk

Abstract

The availability of large-scale datasets has driven the development of neural models that create generic summaries for single or multiple documents. For query-focused summarization (QFS), labeled training data in the form of queries, documents, and summaries is not readily available. We provide a unified modeling framework for any kind of summarization, under the assumption that all summaries are a response to a query, which is observed in the case of QFS and latent in the case of generic summarization. We model queries as discrete latent variables over document tokens, and learn representations compatible with observed and unobserved query verbalizations. Our framework formulates summarization as a generative process, and jointly optimizes a latent query model and a conditional language model. Despite learning from generic summarization data only, our approach outperforms strong comparison systems across benchmarks, query types, document settings, and target domains.
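As a reading aid, here is a minimal sketch of the generative factorization the abstract describes, in our own notation (the symbols S, D, Q and the parameter subscripts are ours, not taken from the paper): the summary S is generated from the document D by marginalizing over a latent query Q, modeled as discrete variables over document tokens.

% Sketch in our own notation; not copied from the paper.
\[
p(S \mid D) \;=\; \sum_{Q} \underbrace{p_{\phi}(Q \mid D)}_{\text{latent query model}} \; \underbrace{p_{\theta}(S \mid D, Q)}_{\text{conditional language model}}
\]

The two factors correspond to the jointly optimized latent query model and conditional language model; for QFS, where the query is observed, Q would presumably be clamped to the given query rather than marginalized.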

Publisher

MIT Press - Journals

Subjects

Artificial Intelligence, Computer Science Applications, Linguistics and Language, Human-Computer Interaction, Communication

References (52 articles; first five shown)

1. Abdullah (2020). Towards generating query to perform query focused abstractive summarization using pre-trained model.

2. Badrinath (2011). Improving query focused summarization using look-ahead strategy.

3. Bajaj (2016). MS MARCO: A human generated machine reading comprehension dataset. arXiv preprint arXiv:1611.09268.

4. Baumel (2016). Topic concentration in query focused summarization datasets.

5. Baumel (2018). Query focused abstractive summarization: Incorporating query relevance, multi-document coverage, and summary length constraints into seq2seq models. arXiv preprint arXiv:1801.07704.

Cited by 7 articles (first five shown)

1. ReQuEST: A Small-Scale Multi-Task Model for Community Question-Answering Systems. IEEE Access, 2024.

2. Review on Query-focused Multi-document Summarization (QMDS) with Comparative Analysis. ACM Computing Surveys, 2023-08-26.

3. Query-Based Extractive Text Summarization Using Sense-Oriented Semantic Relatedness Measure. Arabian Journal for Science and Engineering, 2023-08-18.

4. Plan and generate: Explicit and implicit variational augmentation for multi-document summarization of scientific articles. Information Processing & Management, 2023-07.

5. Improving Multi-Document Summarization with GRU-BERT Network. 2023 International Conference on Recent Advances in Electrical, Electronics & Digital Healthcare Technologies (REEDCON), 2023-05-01.
