Affiliations:
1. Health Economics and Evidence Development, Novartis Oncology, East Hanover, NJ
2. Mendel, San Jose, CA
3. Novartis Pharma SAS, Rueil-Malmaison, Paris, France
4. Novartis Healthcare Private Limited, Hyderabad, India
Abstract
PURPOSE Pancreatic cancer (PaC) is often diagnosed at advanced stages, resulting in one of the lowest survival rates among patients with cancer. The purpose of this study was to investigate whether machine learning (ML) models can predict, with high sensitivity and specificity, an increased risk for PaC ahead of clinical diagnosis.
METHODS The Optum deidentified electronic health record (EHR) data set was used to extract 1-year data for each patient and to sample for PaC diagnosis, the number of interactions with the health care system, and unique demographic and clinical features. Data for patients with a PaC diagnosis were collected between 1 and 2 years before the diagnosis. Standard binary classification ML models were used on training and testing data sets. Data analyses were performed using the scikit-learn package version 1.0.1.
RESULTS The data set consisted of 18,987 patient EHRs collected between December 31, 2007, and December 31, 2017. EHRs with 10 unique features and at least three health care interactions were used for model training (N = 15,189; n = 8,438 [56%] with PaC) and testing (N = 3,798; n = 2,127 [56%] with PaC). The ensemble model achieved an AUC of 0.89, a sensitivity of 85.61%, and a specificity of 76.18% on the testing data set and produced superior results compared with other binary classifiers. Increasing unique health care interactions to nine failed to improve the AUC score. When the testing data set was enlarged to 5,696 patients, the ensemble model achieved an AUC of 0.92 and a specificity of 93.21%, but the sensitivity was compromised.
CONCLUSION The ensemble model exceeded the state-of-the-art level of performance for prediction of PaC ahead of clinical diagnosis with minimal clinically guided input, providing a potential strategy for selection of high-risk patients for further screening.
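
The three metrics reported in the abstract (AUC, sensitivity, specificity) can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline, which used scikit-learn 1.0.1 on Optum EHR data; the function names, the toy labels and scores, and the decision thresholds below are all assumptions made for illustration only. The sketch also shows why raising the decision threshold can improve specificity while reducing sensitivity, the kind of trade-off noted for the enlarged testing set.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for scores cut at `threshold`.

    Labels are 1 (case) or 0 (control); a score >= threshold predicts a case.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation of the ROC AUC).
    """
    cases = [s for label, s in zip(y_true, y_score) if label == 1]
    controls = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: four cases followed by four controls.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    for cut in (0.5, 0.75):
        sens, spec = sensitivity_specificity(toy_labels, toy_scores, cut)
        print(f"threshold={cut}: sensitivity={sens}, specificity={spec}")
```

On the toy data, moving the threshold from 0.5 to 0.75 removes the single false positive but also misses one more case. AUC is unchanged by the threshold because it depends only on how the scores rank cases against controls.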
Publisher
American Society of Clinical Oncology (ASCO)
Cited by
1 article.