Abstract
The role of microRNAs (miRNAs) in cellular processes has captured the attention of many researchers, since their dysregulation has been shown to affect the cancer landscape by sustaining proliferative signaling, evading programmed cell death, and inhibiting growth suppressors. miRNAs have therefore been considered important diagnostic and prognostic biomarkers for several types of tumors. Machine learning algorithms have proven able to exploit the information contained in thousands of miRNAs to accurately predict and classify cancer types. Nevertheless, extracting the most relevant miRNA expressions is fundamental to allow human experts to validate and make sense of the results obtained by automatic algorithms. We propose a novel feature selection approach that identifies the most important miRNAs for tumor classification, based on consensus on feature relevance among high-accuracy classifiers of different typologies. The proposed methodology is tested on a real-world dataset featuring 8,129 patients, 29 different types of tumors, and 1,046 miRNAs per patient, taken from The Cancer Genome Atlas (TCGA) database. A new miRNA signature is suggested, containing the 100 most important oncogenic miRNAs identified by the presented approach. This signature proves sufficient to identify all 29 types of cancer considered in the study, with results nearly identical to those obtained using all 1,046 features in the original dataset. Subsequently, a meta-analysis of the medical literature is performed to find references to the most important biomarkers extracted by the methodology. Besides known oncomarkers, 15 new miRNAs not previously ranked as important biomarkers for diagnosis and prognosis in cancer pathologies are uncovered. Such miRNAs, considered relevant by the machine learning algorithms but still relatively unexplored in the specialized literature, could provide further insights into the biology of cancer.
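The idea of selecting features by consensus among classifiers can be illustrated with a minimal sketch. It is not the authors' exact procedure, which the abstract does not specify. It assumes each classifier has already produced one relevance score per miRNA, and it uses a simple voting rule chosen here for illustration: a feature earns one vote from every classifier that ranks it within that classifier's own top-k. The function name, the classifier names, the miRNA labels, and all scores are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def consensus_top_features(
    importances_per_model: Dict[str, Dict[str, float]],
    top_k_per_model: int,
    signature_size: int,
) -> List[str]:
    """Return the features that most models place in their own top-k.

    Assumption (not from the paper): relevance is compared by absolute
    value, so signed coefficients and non-negative importances can be
    ranked the same way within each model.
    """
    votes: Counter = Counter()
    for importances in importances_per_model.values():
        # Rank this model's features by decreasing absolute relevance.
        ranked = sorted(
            importances, key=lambda name: abs(importances[name]), reverse=True
        )
        # Each model votes once for each of its top-k features.
        for feature in ranked[:top_k_per_model]:
            votes[feature] += 1

    # Order by vote count, breaking ties alphabetically for determinism.
    ordered = sorted(votes, key=lambda name: (-votes[name], name))
    return ordered[:signature_size]


# Hypothetical relevance scores for four placeholder miRNAs.
toy_scores = {
    "random_forest": {"miR-A": 0.40, "miR-B": 0.30, "miR-C": 0.20, "miR-D": 0.10},
    "logistic_regression": {"miR-A": -1.2, "miR-B": 0.1, "miR-C": 0.9, "miR-D": 0.3},
    "svm": {"miR-A": 0.8, "miR-B": 0.05, "miR-C": 0.7, "miR-D": 0.1},
}

signature = consensus_top_features(toy_scores, top_k_per_model=2, signature_size=2)
print(signature)
```

In this toy input, miR-A is in the top two of all three classifiers and miR-C in the top two of two of them, so those two form the signature. In the setting described above, the same voting would run over 1,046 miRNAs with a signature size of 100.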
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.