An Ensemble Learning Approach for Cancer Drug Prediction

Authors:

Mandera Darsh, Ritz Anna

Abstract

Predicting how a specific cancer will respond to a particular drug, even when its genetic mutations are known, remains a major challenge in modern oncology and precision medicine. Today, prescribing a drug for a cancer patient relies on a doctor's analysis of published articles and previous clinical trials, which is an extremely time-consuming process. We developed a machine learning classifier that automatically predicts a drug from a carcinogenic gene mutation profile. Using the Breast Invasive Carcinoma dataset from The Cancer Genome Atlas (TCGA), the method first selects features from mutated genes and then applies K-Fold, Decision Tree, Random Forest, and Ensemble Learning classifiers to predict the best drugs. Ensemble Learning achieved a prediction accuracy of 66% on the test set. To check that the model is general-purpose, we trained and tested it on Lung Adenocarcinoma (LUAD) and Colorectal Adenocarcinoma (COADREAD) data from TCGA, obtaining prediction accuracies of 50% and 66%, respectively. These results suggest a direct correlation between prediction accuracy and cancer dataset size. More importantly, the LUAD and COADREAD results show that the model is general-purpose, as it achieves similar results across multiple cancer types. We further validated the model by applying it to patients with unclear recovery status from the COADREAD dataset. In every case, the model predicted a drug that had been administered to the patient. This method can save oncologists significant time compared with their current approach of extensive background research, and it supports personalized care for cancer patients.
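The pipeline described in the abstract (feature selection over mutated genes, followed by tree-based classifiers combined into an ensemble) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which feature-selection method, ensemble scheme, or evaluation protocol was used, so the chi-squared `SelectKBest` step, the soft-voting `VotingClassifier`, and the 5-fold cross-validation are all stand-ins. The mutation matrix and drug labels are synthetic, not TCGA data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for a mutation matrix (assumption, not TCGA data):
# rows are patients, columns are genes, 1 means the gene is mutated.
n_patients, n_genes, n_drugs = 300, 50, 3
X = rng.integers(0, 2, size=(n_patients, n_genes))

# Hypothetical drug labels that depend on the first two genes.
y = (X[:, 0] + 2 * X[:, 1]) % n_drugs

# Replace roughly 10% of the labels with random ones to mimic noisy records.
noise = rng.random(n_patients) < 0.1
y[noise] = rng.integers(0, n_drugs, size=noise.sum())

# Ensemble of a decision tree and a random forest, combined by
# averaging predicted class probabilities (soft voting).
ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)

# Keep the 10 genes most associated with the label, then classify.
# Putting selection inside the pipeline refits it on each training fold.
model = make_pipeline(SelectKBest(chi2, k=10), ensemble)

# Stratified 5-fold cross-validation keeps drug proportions similar per fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"mean 5-fold accuracy: {mean_accuracy:.2f}")
```

Keeping the feature-selection step inside the pipeline means the genes are chosen from the training folds only, which avoids leaking information from held-out patients into the accuracy estimate.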

Publisher

Cold Spring Harbor Laboratory

