Contextualizing Genes by Using Text-Mined Co-Occurrence Features for Cancer Gene Panel Discovery

Published:2021-10-25 Volume:12
ISSN:1664-8021
Container-title:Frontiers in Genetics
Short-container-title:Front. Genet.

Author:

Chen Hui-O,Lin Peng-Chan,Liu Chen-Ruei,Wang Chi-Shiang,Chiang Jung-Hsien

Abstract

A biomedically explainable and validatable text-mining pipeline can aid cancer gene panel discovery. We create a pipeline that contextualizes genes by using text-mined co-occurrence features, applying Biomedical Natural Language Processing (BioNLP) techniques to mine the literature on cancer gene panels. From the literature we built a 4,679 × 4,630 gene term-feature matrix. The EGFR L858R and T790M variants and the BRAF V600E variant are important mutation term features in text mining and are frequently mutated in cancer. We validate the cancer gene panel against the mutational landscape of different cancer types. The cosine similarity between gene frequencies obtained from text mining and those obtained statistically from clinical sequencing data is 80.8%. Across different machine learning models, the best accuracy for predicting two gene panels, MSK-IMPACT (Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets) and the Oncomine cancer gene panel, is 0.959 and 0.989, respectively. Receiver operating characteristic (ROC) curve analysis confirmed that the neural net model has better prediction performance (area under the ROC curve (AUC) = 0.992). Text-mined co-occurrence features can thus contextualize each gene. We believe the approach can be used to evaluate several existing gene panels, and we show that part of a gene panel set can be used to predict the remaining genes for cancer discovery.
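As an illustration of the two ideas named in the abstract (a gene term-feature matrix built from co-occurrence counts, and cosine similarity between frequency vectors), the following is a minimal sketch and not the authors' pipeline. The sentences, the term list, and the function names `cooccurrence_matrix` and `cosine_similarity` are hypothetical, and sentence-level whitespace tokenization is an assumption made only to keep the example short.

```python
import math
from collections import Counter


def cooccurrence_matrix(sentences, genes, terms):
    """Return a gene x term matrix of sentence-level co-occurrence counts.

    Assumption: a gene and a term co-occur when both appear as
    whitespace-separated tokens in the same sentence.
    """
    counts = {gene: Counter() for gene in genes}
    for sentence in sentences:
        tokens = set(sentence.split())
        for gene in genes:
            if gene not in tokens:
                continue
            for term in terms:
                if term in tokens:
                    counts[gene][term] += 1
    return [[counts[gene][term] for term in terms] for gene in genes]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, for illustration only.
sentences = [
    "EGFR L858R mutation in lung cancer",
    "EGFR T790M resistance in lung cancer",
    "BRAF V600E mutation in melanoma",
]
genes = ["EGFR", "BRAF"]
terms = ["L858R", "T790M", "V600E", "lung", "melanoma"]

matrix = cooccurrence_matrix(sentences, genes, terms)
similarity = cosine_similarity(matrix[0], matrix[1])
```

Each row of `matrix` is the feature vector that contextualizes one gene. The same `cosine_similarity` function could compare a vector of text-mined gene frequencies with a vector of frequencies from sequencing data, provided both vectors list the genes in the same order.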

Funder

Ministry of Science and Technology, Taiwan

Ministry of Health and Family Welfare

Publisher

Frontiers Media SA

Subject

Genetics (clinical),Genetics,Molecular Medicine

References: 47 articles.

1. Cisplatin-Based Adjuvant Chemotherapy in Patients with Completely Resected Non-small-cell Lung Cancer;Arriagada & International Adjuvant Lung Cancer Trial Collaborative Group;N. Engl. J. Med.,2004

2. Gene Ontology: Tool for the Unification of Biology. The Gene Ontology Consortium;Ashburner;Nat. Genet.,2000

3. Global Genetics Research in Prostate Cancer: A Text Mining and Computational Network Theory Approach;Azam;Front. Genet.,2019

4. Dual Kinase Inhibition in the Treatment of Breast Cancer: Initial Experience with the EGFR/ErbB-2 Inhibitor Lapatinib;Burris;Oncologist,2004

5. Interleukin-13 Inhibits Interleukin-2-Induced Proliferation and Protects Chronic Lymphocytic Leukemia B Cells from In Vitro Apoptosis;Chaouchi;Blood,1996

Cited by 5 articles.

1. Retrieval Augmented Therapy Suggestion for Molecular Tumor Boards (Preprint);2024-07-16

2. Non-Overlapping Block Processing of Cancer Genes Data for Earlier Prediction of Breast Cancer Diseases using Regression Algorithms;2023 IEEE International Conference on ICT in Business Industry & Government (ICTBIG);2023-12-08

3. Introducing AI to the molecular tumor board: one direction toward the establishment of precision medicine using large-scale cancer clinical and biological information;Experimental Hematology & Oncology;2022-10-31

4. Cutting-Edge AI Technologies Meet Precision Medicine to Improve Cancer Care;Biomolecules;2022-08-17

5. Identifying and Validating Networks of Oncology Biomarkers Mined From the Scientific Literature;Cancer Informatics;2022-01