Affiliation:
1. Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University , Wuhan 430070, People’s Republic of China
2. Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University , Wuhan 430070, People’s Republic of China
Abstract
Abstract
Motivation
Classification of samples using biomedical omics data is a widely used method in biomedical research. However, these datasets often possess challenging characteristics, including high dimensionality, limited sample sizes, and inherent biases across diverse sources. These factors limit the performance of traditional machine learning models, particularly when applied to independent datasets.
Results
To address these challenges, we propose a novel classifier, Deep Centroid, which combines the stability of the nearest centroid classifier and the strong fitting ability of the deep cascade strategy. Deep Centroid is an ensemble learning method with a multi-layer cascade structure, consisting of feature scanning and cascade learning stages that can dynamically adjust the training scale. We apply Deep Centroid to three precision medicine applications—cancer early diagnosis, cancer prognosis, and drug sensitivity prediction—using cell-free DNA fragmentations, gene expression profiles, and DNA methylation data. Experimental results demonstrate that Deep Centroid outperforms six traditional machine learning models in all three applications, showcasing its potential in biological omics data classification. Furthermore, functional annotations reveal that the features scanned by the model exhibit biological significance, indicating its interpretability from a biological perspective. Our findings underscore the promising application of Deep Centroid in the classification of biomedical omics data, particularly in the field of precision medicine.
Availability and implementation
Deep Centroid is available at both github (github.com/xiexiexiekuan/DeepCentroid) and Figshare (https://figshare.com/articles/software/Deep_Centroid_A_General_Deep_Cascade_Classifier_for_Biomedical_Omics_Data_Classification/24993516).
Funder
Fundamental Research Funds for the Central Universities
Publisher
Oxford University Press (OUP)