Abstract
Pharmacogenomics (PGx) studies how individual gene variations impact drug response phenotypes, which makes PGx knowledge a key component of precision medicine. A significant part of the state-of-the-art knowledge in PGx is accumulated in scientific publications, where it is difficult for humans or software to use. Natural language processing techniques have been developed and are employed to guide the experts who curate this knowledge. However, existing work is limited by the absence of high-quality annotated corpora focusing on the domain. This absence restricts, in particular, the use of supervised machine learning approaches. This article introduces PGxCorpus, a manually annotated corpus designed for the automatic extraction of PGx relationships from text. It comprises 945 sentences from 911 PubMed abstracts, annotated with PGx entities of interest (mainly gene variations, genes, drugs and phenotypes) and the relationships between them. We present the method used to annotate texts consistently, and a baseline experiment that illustrates how this resource may be leveraged to synthesize and summarize PGx knowledge.
Publisher
Cold Spring Harbor Laboratory