Abstract
Infusing structured semantic representations into language models is a rising research trend underpinning many natural language processing tasks that require understanding and reasoning capabilities. Decoupling factual, unambiguous concept units from the lexical surface holds great potential in abstractive summarization, especially in the biomedical domain, where fact selection and rephrasing are made more difficult by specialized jargon and hard factuality constraints. Nevertheless, current graph-augmented contributions rely on extractive binary relations, failing to model the real-world n-ary and nested biomedical interactions mentioned in the text. To alleviate this issue, we present EASumm, the first framework for biomedical abstractive summarization empowered by event extraction, namely graph-based representations of relevant medical evidence derived from the source scientific document. By relying on dual text-graph encoders, we demonstrate the promising role of explicit event structures, achieving performance better than or comparable to previous state-of-the-art models on the CDSR dataset. We conduct extensive ablation studies, including a broad comparison of graph representation learning techniques. Finally, we offer insights to guide future research in the field.
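The abstract outlines the core architectural idea: the source document and an event graph extracted from it are encoded by two dedicated encoders and then fused before decoding. The following is a minimal, hypothetical PyTorch sketch of such a dual text-graph encoder; the module names, dimensions, the single-round message-passing graph encoder, and the cross-attention fusion are illustrative assumptions for exposition, not the EASumm implementation described in the paper.

```python
# Hypothetical sketch of a dual text-graph encoder (NOT the EASumm code).
# Assumptions: a Transformer text encoder, one round of mean message passing
# over event-graph nodes, and cross-attention fusion from tokens to nodes.
import torch
import torch.nn as nn


class SimpleGraphEncoder(nn.Module):
    """One round of neighbourhood message passing over event-graph nodes."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_nodes, dim); adj: (num_nodes, num_nodes), row-normalised.
        messages = adj @ node_feats  # aggregate neighbour features
        return torch.relu(self.proj(messages + node_feats))


class DualTextGraphEncoder(nn.Module):
    """Encodes source tokens and an event graph, then fuses the two views."""

    def __init__(self, vocab_size: int = 30000, dim: int = 256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.graph_encoder = SimpleGraphEncoder(dim)
        # Text tokens attend over event nodes (an assumed fusion strategy).
        self.fusion = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, token_ids, node_feats, adj):
        text = self.text_encoder(self.tok_emb(token_ids))         # (B, T, dim)
        nodes = self.graph_encoder(node_feats, adj).unsqueeze(0)  # (1, N, dim)
        nodes = nodes.expand(text.size(0), -1, -1)                # (B, N, dim)
        fused, _ = self.fusion(query=text, key=nodes, value=nodes)
        return text + fused  # graph-aware token states for a downstream decoder


if __name__ == "__main__":
    enc = DualTextGraphEncoder()
    tokens = torch.randint(0, 30000, (1, 12))    # toy source document
    nodes = torch.randn(5, 256)                  # toy event-node features
    adj = torch.softmax(torch.randn(5, 5), -1)   # toy normalised adjacency
    print(enc(tokens, nodes, adj).shape)         # torch.Size([1, 12, 256])
```

In such a setup, the fused, graph-aware token states would feed a standard autoregressive decoder that generates the summary.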
Funder
Alma Mater Studiorum - Università di Bologna
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications, Computer Networks and Communications, Computer Graphics and Computer-Aided Design, Computational Theory and Mathematics, Artificial Intelligence, General Computer Science