Abstract
Concept-to-text generation refers to the task of automatically producing textual output from non-linguistic input. We present a joint model that captures content selection ("what to say") and surface realization ("how to say it") in an unsupervised, domain-independent fashion. Rather than breaking up the generation process into a sequence of local decisions, we define a probabilistic context-free grammar that globally describes the inherent structure of the input (a corpus of database records and text describing some of them). We recast generation as the task of finding the best derivation tree for a set of database records, and we describe a decoding algorithm that allows us to intersect the grammar with additional information capturing fluency and syntactic well-formedness constraints. Experimental evaluation on several domains yields results competitive with state-of-the-art systems that rely on domain-specific constraints, explicit feature engineering, or labeled data.
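As a rough sketch of the idea (not the authors' implementation), the toy Python example below defines a hypothetical PCFG over weather-style database records: each derivation jointly decides which records to mention (content selection) and how to word them (surface realization), and decoding picks the derivation maximizing the grammar score intersected with a bigram language-model score standing in for the fluency constraints. All records, rules, probabilities, and the language model here are invented for illustration.

```python
# Minimal illustrative sketch: a toy PCFG whose derivations select which
# database records to verbalize and how to word them; decoding intersects
# the grammar score with a bigram LM score. Everything here is hypothetical.
import itertools
import math

# Toy database records: field -> value
records = {"temperature": "25", "wind": "low"}

# PCFG rules: lhs -> list of (rhs_symbols, probability).
# Uppercase symbols are nonterminals; other strings are terminals.
grammar = {
    "S": [(("TEMP",), 0.4), (("WIND",), 0.2), (("TEMP", "WIND"), 0.4)],
    "TEMP": [(("it", "is", records["temperature"], "degrees"), 0.7),
             (("temperature", "is", records["temperature"]), 0.3)],
    "WIND": [(("wind", "is", records["wind"]), 0.6),
             (("with", records["wind"], "wind"), 0.4)],
}

# A tiny bigram "language model" used to intersect the grammar with a
# fluency score; unseen bigrams get a small smoothing probability.
bigram_lm = {("it", "is"): 0.9, ("is", "25"): 0.8, ("25", "degrees"): 0.9,
             ("degrees", "with"): 0.4, ("with", "low"): 0.7,
             ("low", "wind"): 0.8}

def lm_logprob(words):
    """Bigram log-probability of a word sequence, with crude smoothing."""
    return sum(math.log(bigram_lm.get(b, 1e-3))
               for b in zip(words, words[1:]))

def derivations(symbol):
    """Enumerate (words, grammar_logprob) for every derivation of symbol."""
    if symbol not in grammar:  # terminal symbol
        yield [symbol], 0.0
        return
    for rhs, prob in grammar[symbol]:
        for parts in itertools.product(*(derivations(s) for s in rhs)):
            words = [w for ws, _ in parts for w in ws]
            logp = math.log(prob) + sum(lp for _, lp in parts)
            yield words, logp

# Decode: the best derivation under grammar score + LM score.
best = max(derivations("S"), key=lambda d: d[1] + lm_logprob(d[0]))
print(" ".join(best[0]))  # e.g. "it is 25 degrees"
```

In the paper the decoding problem is solved with dynamic programming over a much larger grammar; the exhaustive enumeration above merely makes the objective, grammar score plus language-model score over a derivation's yield, concrete on a grammar small enough to search by brute force.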
Cited by
23 articles.
1. WEATHERGOV+. Proceedings of the ACM Symposium on Document Engineering 2023, 2023-08-22.
2. SAN-T2T: An automated table-to-text generator based on selective attention network. Natural Language Engineering, 2023-05-05.
3. A Logic Aware Neural Generation Method for Explainable Data-to-text. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022-08-14.
4. Retrieval Enhanced Segment Generation Neural Network for Task-Oriented Dialogue Systems. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022-05-23.
5. Using natural language generation to bootstrap missing Wikipedia articles: A human-centric perspective. Semantic Web, 2022-02-03.