Affiliation:
1. Universitas Gadjah Mada
2. Universitas Gadjah Mada Fakultas Kedokteran
3. Rumah Sakit Dharmais Pusat Kanker Nasional
4. Universitas Andalas Fakultas Kedokteran
Abstract
Background: In 2018, there were an estimated 2,088,849 new breast cancer cases (11.6%) and 626,679 deaths from the disease (6.6%). One cause of the high breast cancer mortality rate in Indonesia is that 60-70% of patients are diagnosed at an advanced stage. This is related to the perception of breast cancer risk among Indonesian women. Therefore, a calculation of breast cancer risk factors is needed to help raise public awareness of breast cancer risk in Indonesia. This study aimed to construct a machine learning-based model for calculating breast cancer risk factors in Indonesia.
Methods: This quantitative study used a case-control design. Data were collected at Dr. M. Djamil General Hospital Padang, Sardjito General Hospital Yogyakarta and Dharmais Cancer Hospital Jakarta from July 2018 to July 2019. The sample comprised 1,000 women in the case group (breast cancer) and 1,000 women in the control group (non-breast cancer), matched by age and sex. Convenience sampling was used. Data were collected from medical records, and primary data were collected using a questionnaire. The chi-square test was used for bivariate analysis, and the risk factor calculation used the machine learning algorithms Naive Bayes, decision tree, k-nearest neighbors, support vector machine and logistic regression. The algorithm was selected by comparing accuracy, true positive rate, false positive rate and area under the curve (AUC). STATA version 14.2 and the Waikato Environment for Knowledge Analysis (WEKA) version 3.6.4 were used to process the data.
Results: The model for calculating breast cancer risk factors in Indonesia is based on the following predictors: age at menopause, age at first pregnancy, first- and second-degree family history of breast cancer, use of oral contraceptives, history of smoking, overweight, obesity, high-fat diet, high-calorie diet and physical activity. A total score of > 7 classifies a woman as at high risk of breast cancer, and a total score of ≤ 7 as at low risk. The model had an accuracy of 79.9%, with a sensitivity of 76.90% and a specificity of 70.4%.
Conclusion: This breast cancer risk factor calculation performs reasonably well in classifying breast cancer risk in Indonesia.
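The cut-off rule and the reported evaluation metrics can be illustrated with a minimal Python sketch. Only the threshold (total score > 7 means high risk) comes from the abstract. The function names, the example scores and the labels are hypothetical, and the per-predictor score weights are not given in the abstract, so the sketch starts from an already computed total score. It is not the authors' STATA or WEKA workflow.

```python
from typing import Dict, Sequence

# From the abstract: total score > 7 is high risk, <= 7 is low risk.
HIGH_RISK_CUTOFF = 7


def classify_risk(total_score: float) -> str:
    """Apply the cut-off rule to an already computed total score."""
    return "high" if total_score > HIGH_RISK_CUTOFF else "low"


def evaluate(scores: Sequence[float], has_cancer: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity of the cut-off rule.

    Assumes the data contain at least one case and at least one control;
    otherwise the divisions below would fail.
    """
    tp = fp = tn = fn = 0
    for score, is_case in zip(scores, has_cancer):
        predicted_high = classify_risk(score) == "high"
        if predicted_high and is_case:
            tp += 1  # case correctly flagged as high risk
        elif predicted_high and not is_case:
            fp += 1  # control wrongly flagged as high risk
        elif not predicted_high and is_case:
            fn += 1  # case missed by the rule
        else:
            tn += 1  # control correctly classified as low risk
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # 1 - false positive rate
    }


if __name__ == "__main__":
    # Hypothetical total scores and case/control labels, for illustration only.
    example_scores = [9, 8, 5, 7, 10, 3]
    example_labels = [True, True, True, False, False, False]
    print(evaluate(example_scores, example_labels))
```

In a matched case-control sample like the one described, sensitivity is computed over the case group and specificity over the control group, so the two metrics do not depend on the case-to-control ratio.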
Publisher
Research Square Platform LLC
Cited by
1 article.