BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine

Published:2021-07-15 Issue:1 Volume:12 Page:
ISSN:2041-1480
Container-title:Journal of Biomedical Semantics
language:en
Short-container-title:J Biomed Semant

Author:

Majewska Olga,Collins Charlotte,Baker Simon,Björne Jari,Brown Susan Windisch,Korhonen Anna,Palmer Martha

Abstract

Background: Recent advances in representation learning have enabled large strides in natural language understanding; however, verbal reasoning remains a challenge for state-of-the-art systems. External sources of structured, expert-curated verb-related knowledge have been shown to boost model performance in Natural Language Processing (NLP) tasks where accurate handling of verb meaning and behaviour is critical. The cost and time required for manual lexicon construction have been a major obstacle to porting the benefits of such resources to NLP in specialised domains, such as biomedicine. To address this issue, we combine a neural classification method with expert annotation to create BioVerbNet. This new resource comprises 693 verbs assigned to 22 top-level and 117 fine-grained semantic-syntactic verb classes. We make this resource available complete with semantic roles and VerbNet-style syntactic frames.

Results: We demonstrate the utility of the new resource in boosting model performance in document- and sentence-level classification in biomedicine. We apply an established retrofitting method to harness the verb class membership knowledge from BioVerbNet and transform a pretrained word embedding space by pulling together verbs belonging to the same semantic-syntactic class. The BioVerbNet knowledge-aware embeddings surpass the non-specialised baseline by a significant margin on both tasks.

Conclusion: This work introduces the first large, annotated semantic-syntactic classification of biomedical verbs. It provides a detailed account of the annotation process, the key differences in verb behaviour between the general and biomedical domains, and the design choices made to accurately capture the meaning and properties of verbs used in biomedical texts. The demonstrated benefits of leveraging BioVerbNet in text classification suggest the resource could help systems better tackle challenging NLP tasks in biomedicine.
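The retrofitting step mentioned in the abstract (pulling together the embeddings of verbs that share a class) can be illustrated with a minimal sketch. The abstract does not specify the exact method, so this assumes a simple iterative update in which each vector is averaged with the mean of its class neighbours. The function names, the toy verbs, the toy vectors, and the class label are all hypothetical and are not taken from BioVerbNet.

```python
from itertools import combinations


def retrofit(vectors, classes, iterations=10):
    """Pull vectors of same-class words together (illustrative sketch only).

    vectors: dict mapping word -> list of floats (the pretrained embedding).
    classes: dict mapping class name -> list of member words.
    Assumption: equal weight on the original vector and the neighbour mean.
    """
    # Link every pair of words that share a class.
    neighbours = {w: set() for w in vectors}
    for members in classes.values():
        present = [w for w in members if w in vectors]
        for a, b in combinations(present, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    new = {w: list(v) for w, v in vectors.items()}
    for _ in range(iterations):
        for w, nbrs in neighbours.items():
            if not nbrs:
                continue  # words without class neighbours keep their vector
            dim = len(vectors[w])
            mean_nbr = [sum(new[n][k] for n in nbrs) / len(nbrs) for k in range(dim)]
            # Average the ORIGINAL vector with the current neighbour mean.
            new[w] = [(vectors[w][k] + mean_nbr[k]) / 2.0 for k in range(dim)]
    return new


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical toy data: two verbs placed in one made-up class, one verb outside it.
toy_vectors = {
    "inhibit": [1.0, 0.0],
    "suppress": [0.0, 1.0],
    "migrate": [-1.0, -1.0],
}
toy_classes = {"hypothetical_class_A": ["inhibit", "suppress"]}
retrofitted = retrofit(toy_vectors, toy_classes)
```

In this toy setup the two same-class verbs end up closer to each other than they started, while the verb with no class neighbours keeps its original vector.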

Funder

European Research Council

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications,Health Informatics,Computer Science Applications,Information Systems

Link

https://link.springer.com/content/pdf/10.1186/s13326-021-00247-z.pdf

Reference: 64 articles.

1. Schuyler PL, Hole WT, Tuttle MS, Sherertz DD. The UMLS Metathesaurus: representing different views of biomedical concepts. Bull Med Libr Assoc. 1993; 81(2):217.

2. Ananiadou S, McNaught J. Text Mining for Biology and Biomedicine. London: Artech House; 2006.

3. Venturi G, Montemagni S, Marchi S, Sasaki Y, Thompson P, McNaught J, Ananiadou S. Bootstrapping a verb lexicon for biomedical information extraction. In: International Conference on Intelligent Text Processing and Computational Linguistics. Springer: 2009. p. 137–48. https://doi.org/10.1007/978-3-642-00382-0_11.

4. Tan H. A system for building FrameNet-like corpus for the biomedical domain. In: Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi). Association for Computational Linguistics: 2014. p. 46–53. https://doi.org/10.3115/v1/w14-1107.

5. Mondal A, Das D, Cambria E, Bandyopadhyay S. WME 3.0: An enhanced and validated lexicon of medical concepts. In: Proceedings of the 9th Global WordNet Conference (GWC). Nanyang Technological University (NTU): Global Wordnet Association: 2018. p. 10–6. https://aclanthology.org/2018.gwc-1.2.

Cited by 6 articles.

1. My Big, Fat 50-Year Journey;Computational Linguistics;2024-01-15

2. VerbAligNet: Unlocking Multilingual Exploration of Verbal Arguments;Communications in Computer and Information Science;2024

3. The robotic-surgery propositional bank;Language Resources and Evaluation;2023-06-13

4. MedLexSp – a medical lexicon for Spanish medical natural language processing;Journal of Biomedical Semantics;2023-02-02

5. An Ensemble Semantic Textual Similarity Measure Based on Multiple Evidences for Biomedical Documents;Computational and Mathematical Methods in Medicine;2022-08-27