Affiliation:
1. BC Cancer, Vancouver, British Columbia, Canada
2. Department of Computer Science, University of British Columbia, Vancouver, British Columbia, Canada
3. Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
Abstract
ImportancePredicting short- and long-term survival of patients with cancer may improve their care. Prior predictive models either use data with limited availability or predict the outcome of only 1 type of cancer.ObjectiveTo investigate whether natural language processing can predict survival of patients with general cancer from a patient’s initial oncologist consultation document.Design, Setting, and ParticipantsThis retrospective prognostic study used data from 47 625 of 59 800 patients who started cancer care at any of the 6 BC Cancer sites located in the province of British Columbia between April 1, 2011, and December 31, 2016. Mortality data were updated until April 6, 2022, and data were analyzed from update until September 30, 2022. All patients with a medical or radiation oncologist consultation document generated within 180 days of diagnosis were included; patients seen for multiple cancers were excluded.ExposuresInitial oncologist consultation documents were analyzed using traditional and neural language models.Main Outcomes and MeasuresThe primary outcome was the performance of the predictive models, including balanced accuracy and receiver operating characteristics area under the curve (AUC). The secondary outcome was investigating what words the models used.ResultsOf the 47 625 patients in the sample, 25 428 (53.4%) were female and 22 197 (46.6%) were male, with a mean (SD) age of 64.9 (13.7) years. A total of 41 447 patients (87.0%) survived 6 months, 31 143 (65.4%) survived 36 months, and 27 880 (58.5%) survived 60 months, calculated from their initial oncologist consultation. The best models achieved a balanced accuracy of 0.856 (AUC, 0.928) for predicting 6-month survival, 0.842 (AUC, 0.918) for 36-month survival, and 0.837 (AUC, 0.918) for 60-month survival, on a holdout test set. Differences in what words were important for predicting 6- vs 60-month survival were found.Conclusions and RelevanceThese findings suggest that models performed comparably with or better than previous models predicting cancer survival and that they may be able to predict survival using readily available data without focusing on 1 cancer type.
Publisher
American Medical Association (AMA)
Cited by
9 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献