Affiliation:
1. Chinese Academy of Medical Sciences
Abstract
Background The growing volume of scientific and technical literature makes deeper knowledge discovery increasingly difficult. Biomedical semantic relationship extraction can reveal important biomedical entities and the semantic relationships between them, and is an important basis for biomedical knowledge discovery, clinical decision making and other applications. Identifying the causal relationships of diseases is a significant research field that can help speed up the discovery of the underlying mechanisms of diseases and promote better prevention and treatment.
Methods This study aims to optimize the automatic extraction of disease causality by the SemRep tool by constructing a semantic predicate vocabulary that specifically conveys disease causality, allowing disease causality to be discovered within the biomedical literature. We extracted semantic feature words based on existing research and on the results of parsing and recognizing the literature with SemRep. We then filtered and evaluated textual semantic predicates according to these semantic feature words and constructed a semantic predicate vocabulary expressing disease causality.
Results By improving the automatic extraction of disease causality pairs, the proposed method facilitates better disease causality mining from biomedical literature. We constructed a semantic predicate vocabulary expressing disease causality that comprises 50 predicates with an accuracy of at least 40%.
Conclusions Using optimized semantic predicates to discover disease causality from large-scale biomedical literature is feasible. The approach can also provide insights for the extraction of other types of semantic relationships and for machine learning methods, thus contributing to the discovery and exploitation of disease causality knowledge and supporting clinical diagnosis and disease prevention and control.
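The filtering step described in Methods and Results (keeping only predicates whose accuracy reaches 40%) can be illustrated with a minimal sketch. The abstract does not state how accuracy was measured, so this sketch assumes each extracted predication has been manually judged as expressing disease causality or not; the function name `build_vocabulary`, the input layout, and the sample judgements are all hypothetical and are not taken from the study or from SemRep.

```python
from collections import defaultdict


def build_vocabulary(judgements, min_accuracy=0.40):
    """Return {predicate: accuracy} for predicates at or above the threshold.

    `judgements` is an iterable of (predicate, is_causal) pairs, where
    `is_causal` is an assumed manual judgement of one extracted predication.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for predicate, is_causal in judgements:
        key = predicate.strip().lower()  # normalise case and whitespace
        totals[key] += 1
        if is_causal:
            correct[key] += 1

    vocabulary = {}
    for key, total in totals.items():
        accuracy = correct[key] / total
        if accuracy >= min_accuracy:
            vocabulary[key] = accuracy
    return vocabulary


# Illustrative judgements only; these are not data from the study.
sample = [
    ("CAUSES", True), ("CAUSES", True), ("CAUSES", False),
    ("induce", True), ("induce", False),
    ("associated with", False), ("associated with", False),
    ("associated with", True),
]
vocab = build_vocabulary(sample)
```

With these made-up judgements, "associated with" falls below the threshold and is excluded, while the other two predicates are kept together with their observed accuracy.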
Publisher
Research Square Platform LLC