Affiliation:
1. Düzce University, Faculty of Engineering
2. Düzce University, Institute of Science
Abstract
Data mining is the process of extracting useful information from large-scale data in an understandable and logical way. Its main machine learning techniques are classification and regression, association rules, and cluster analysis. Classification and regression are known as predictive models, whereas clustering and association rules are known as descriptive models. This study uses classification, which aims to assign each data sample to one of a set of predefined classes. The data set was obtained from the UC Irvine Machine Learning Repository. Named “Breast cancer”, it consists of 699 samples and 10 features collected by William H. at the University of Wisconsin hospital. The data describe characteristics of cells analyzed in the detection of breast cancer, including cell division, and record whether the cells are benign or malignant. The classification determines whether the person concerned has cancerous or non-cancerous cells. In this context, data mining analyses were performed with the WEKA and Orange programs using the SVM (Support Vector Machine) and Random Forest algorithms. The results of these analyses were compared with those of previous studies on the same data set. The conclusions are intended to guide medical professionals working in this field in the diagnosis of breast cancer.
Publisher
International Conference on Artificial Intelligence and Applied Mathematics in Engineering