Investigating the impact of key tumor markers to predict and reduction of lymphoma cancer diagnosis duration with a data mining approach

Author:

Ghorbian Mohsen

Abstract

Many data mining techniques are available today, and they produce results with varying precision; selecting an appropriate technique therefore leads to a more complete and accurate analysis. Several criteria exist for evaluating the effectiveness of these techniques, and the right choice depends on the type of data to which they are applied. Data matters in every field, but it plays an especially significant role in healthcare, including data collected on cancers. Analyzing sensitive data such as cancer records with data mining techniques is challenging when the available information is incomplete, which can significantly affect the results. For data on patients with lymphoma, the many factors causing the disease and the lack of information are significant challenges. Lymphoma, a common cancer, is classified as either Hodgkin's or non-Hodgkin's lymphoma. In this research, tumor markers were selected on the criterion that they are common to both types of lymphoma. Five tumor markers, CD3, CD15, CD20, CD30, and LCA, along with the type of lymphoma and the patient's gender, were selected as the variables. Two data mining techniques, Bayesian Networks (Naive Bayes) and the decision tree, are evaluated using the criteria of accuracy, sensitivity, F-score, and error ratio. To determine whether a factor contributes to the diagnosis of lymphoma, a 90% confidence interval and a 65% support value were chosen so that effective factors are identified with the highest level of accuracy.
Based on the implementation and evaluation of these techniques, the decision tree outperformed Bayesian Networks (Naive Bayes), with an accuracy of 82.66%, a sensitivity of 94.98%, an F-score (harmonic mean) of 85.36%, and an error ratio of 17.33%. Our research also concluded that positive CD3 and CD15 tumor markers, as well as the patient's gender, do not play a role in the diagnosis of lymphoma. However, the CD20 and LCA tumor markers can be effective in diagnosing non-Hodgkin's lymphoma, while the CD30 tumor marker can be effective in diagnosing Hodgkin's lymphoma.
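
The four evaluation criteria named in the abstract can be computed from a binary confusion matrix. The following is a minimal sketch in Python, assuming the standard textbook definitions (the abstract does not state the formulas used). The function name `classification_metrics` and the counts in the usage example are illustrative only and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common evaluation criteria from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives,
    false negatives, true negatives (assumed non-degenerate,
    i.e. no denominator below is zero).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total          # share of correct predictions
    sensitivity = tp / (tp + fn)          # recall / true positive rate
    precision = tp / (tp + fp)            # positive predictive value
    # F-score: harmonic mean of precision and sensitivity
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    error_ratio = (fp + fn) / total       # share of wrong predictions
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "f_score": f_score,
        "error_ratio": error_ratio,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions accuracy and error ratio sum to one, which is consistent with the reported 82.66% and 17.33% up to rounding.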

Publisher

Research Square Platform LLC

