Leveraging SEER data through machine learning to predict distant lymph node metastasis and prognosticate outcomes in hepatocellular carcinoma patients

Authors:

Sun Jiaxuan (1), Huang Lei (1), Liu Yahui (1)

Affiliation:

1. Department of Hepatobiliary and Pancreatic Surgery, General Surgery Center, First Hospital of Jilin University, Changchun, China

Abstract

Objectives: This study aims to develop and validate machine learning-based diagnostic and prognostic models to predict the risk of distant lymph node metastasis (DLNM) in patients with hepatocellular carcinoma (HCC) and to evaluate prognosis in this cohort.

Design: This retrospective study uses data extracted from the Surveillance, Epidemiology, and End Results (SEER) database, specifically the January 2024 subset.

Participants: The study cohort consists of 15,775 patients diagnosed with HCC between 2016 and 2020, as identified in the SEER database.

Method: For the diagnostic model, recursive feature elimination (RFE) is used for variable selection and retains five predictors: age, tumor size, radiation therapy, T-stage, and serum alpha-fetoprotein (AFP) level. These variables form the basis of a stacking ensemble model, which is interpreted using Shapley Additive Explanations (SHAP). For the prognostic model, backward stepwise regression is used to select variables, including chemotherapy, radiation therapy, tumor size, and age. These variables feed a prognostic nomogram based on the Cox proportional hazards model.

Main outcome measures: The outcome of the diagnostic model is the occurrence of DLNM. The outcome of the prognostic model is defined by survival time and survival status.

Results: The stacking-based ensemble model demonstrates good predictive performance, interpretability, and discrimination. The area under the curve (AUC) is 0.767 in the training set and 0.768 in the validation set. The nomogram constructed from the Cox model also demonstrates consistent and strong predictive performance. In addition, we identify factors that substantially affect DLNM and prognosis, and discuss their significance in the model and in clinical practice.

Conclusion: Our study identifies key predictive factors for DLNM and significant prognostic indicators for HCC patients with DLNM. These findings give clinicians tools to identify individuals at high risk of DLNM and to stratify risk more precisely within this patient subgroup, potentially improving management strategies and patient outcomes.
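
The diagnostic workflow described in the abstract (RFE to five predictors, a stacking ensemble, AUC on training and validation sets) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic rather than SEER records, and the base learners, meta-learner, split ratio, and all hyperparameters are assumptions, since the abstract does not specify them. The SHAP analysis and the Cox nomogram are omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (assumption: not real patient data).
# The positive class is rare, loosely mimicking a metastasis outcome.
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Step 1: recursive feature elimination down to five predictors.
# The ranking estimator (logistic regression) is an assumption.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_valid_sel = selector.transform(X_valid)

# Step 2: stacking ensemble. Base learners and meta-learner are assumptions.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold base predictions are used to train the meta-learner
)
stack.fit(X_train_sel, y_train)

# Step 3: discrimination measured as AUC on the training and validation sets.
auc_train = roc_auc_score(y_train, stack.predict_proba(X_train_sel)[:, 1])
auc_valid = roc_auc_score(y_valid, stack.predict_proba(X_valid_sel)[:, 1])
print(f"train AUC = {auc_train:.3f}, validation AUC = {auc_valid:.3f}")
```

The feature selector is fitted on the training split only, so the validation AUC is not influenced by information from the validation data. The AUC values printed here describe the synthetic data and are not comparable to the 0.767 and 0.768 reported in the abstract.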

Funder

National Key Research and Development Program of China

Publisher

Wiley
