Affiliations:
1. Instituto Tecnológico de Aguascalientes
2. Centro de Investigaciones en Óptica, Unidad Aguascalientes
Abstract
We evaluated logistic regression as a classifier for the diagnosis of breast cancer based on Raman spectra. Published studies on this subject commonly apply dimensionality reduction techniques before building the classifier. Instead, we examined the effect of using all intensity values recorded in the spectra as input variables to the algorithm. Performance was assessed with leave-one-out cross-validation, measuring classification accuracy, sensitivity, and specificity. The Raman spectra were taken from breast tissue previously diagnosed by histopathological analysis, some from healthy tissue and some from cancerous tissue. Each spectrum consists of 605 intensity values in the range of 687 to 1781 cm⁻¹. The logistic regression classifier exhibited 100% classification accuracy. To establish comparative references, we evaluated in the same way: 1) a logistic model preceded by dimensionality reduction with Principal Component Analysis (PCA+LR), 2) two classifiers obtained with the weighted K-nearest neighbors algorithm, and 3) a classifier using the naive Bayes (NB) algorithm. PCA+LR and NB also reached 100% classification accuracy. Nevertheless, PCA+LR requires more computation time.
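The evaluation protocol described in the abstract (a logistic regression classifier scored by leave-one-out cross-validation on accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain gradient-descent logistic regression written with the Python standard library, and it runs on small synthetic data (40 samples, 8 features) standing in for the 605-value Raman spectra. All function names, hyperparameters, and data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200, l2=1e-3):
    """Fit weights (bias stored last) by batch gradient descent.

    The learning rate, epoch count, and L2 penalty are illustrative
    assumptions, not values taken from the paper.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            # zip truncates to the d feature weights; the bias is w[d].
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[d]
            err = sigmoid(z) - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[d] += err
        for j in range(d):
            w[j] -= lr * (grad[j] / n + l2 * w[j])
        w[d] -= lr * grad[d] / n
    return w


def predict(w, x):
    # Class 1 (cancer) when the estimated probability is at least 0.5.
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1 if sigmoid(z) >= 0.5 else 0


def leave_one_out(X, y):
    # Train on all samples but one, predict the held-out sample, repeat.
    preds = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(predict(fit_logistic(X_train, y_train), X[i]))
    return preds


def metrics(y_true, y_pred):
    # Assumes both classes occur in y_true, so no denominator is zero.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Synthetic stand-in data: 20 "healthy" (label 0) and 20 "cancer" (label 1)
# samples with 8 features each, drawn from shifted Gaussians.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(20)]
X += [[rng.gauss(1.5, 1.0) for _ in range(8)] for _ in range(20)]
y = [0] * 20 + [1] * 20

preds = leave_one_out(X, y)
scores = metrics(y, preds)
print(scores)
```

With real spectra, each row of `X` would hold the intensity values of one spectrum. Intensities would likely need scaling before gradient descent, which is a further assumption not covered by the sketch.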