Abstract
The need for molecular biomarkers for schizophrenia is well recognized. Peripheral blood gene expression profiling and machine learning (ML) tools have recently become popular for biomarker discovery. The stigma associated with a schizophrenia diagnosis underscores the need for diagnostic models with higher precision. In this study, we propose a strategy to develop higher-precision ML models using ensemble learning. We performed a meta-analysis of peripheral blood expression microarray data. Two ML models, support vector machines (SVM) and prediction analysis for microarrays (PAM), were developed using differentially expressed genes as features. The ensemble of SVM-radial and PAM predicted test samples with a precision of 81.33% (SD: 0.078). The precision of the ensemble model was significantly higher than that of SVM-radial (63.83%, SD: 0.081) and PAM (66.89%, SD: 0.097). The feature genes were enriched for biological processes such as response to stress, response to stimulus, regulation of the immune system, and metabolism of organic nitrogen compounds. Network analysis of the feature genes identified PRF1, GZMB, IL2RB, ITGAL, and IL2RG as hub genes. Additionally, the ensemble model developed using microarray data classified RNA-Sequencing samples with moderately high precision (72.00%, SD: 0.08). The pipeline developed in this study allows prediction for a single microarray or RNA-Sequencing sample. In summary, this study developed robust models for clinical application and suggests ensemble learning as a route to higher diagnostic precision in psychiatric disorders.
Research highlights
Ensemble learning of Support Vector Machines (SVM) and Prediction Analysis for Microarrays (PAM) algorithms classified schizophrenia samples with higher precision.
The pipeline developed in this analysis produced robust models able to classify a single microarray sample.
Cross-platform validation of the ensemble model using RNA-Sequencing data resulted in high precision.
Graphical abstract
Blood-based SCZ diagnosis using ensemble learning for higher precision
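Illustrative sketch
The abstract does not say how the SVM-radial and PAM predictions are combined, so the following is only a minimal sketch of one way a two-model ensemble can trade recall for precision. It assumes an agreement rule (a sample is called positive only when both models call it positive), uses scikit-learn's NearestCentroid with a shrink threshold as a stand-in for PAM, and runs on synthetic data rather than expression data. The threshold value, the data dimensions, and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a samples-by-genes matrix (assumption: not real data).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# SVM with a radial (RBF) kernel on standardized features.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm.fit(X_train, y_train)

# Nearest shrunken centroids as a PAM-like classifier
# (assumption: shrink_threshold=0.5 is arbitrary, not a tuned value).
pam_like = make_pipeline(StandardScaler(), NearestCentroid(shrink_threshold=0.5))
pam_like.fit(X_train, y_train)

svm_pred = svm.predict(X_test)
pam_pred = pam_like.predict(X_test)

# Assumed agreement rule: positive only if both models predict class 1.
ensemble_pred = np.logical_and(svm_pred == 1, pam_pred == 1).astype(int)

for name, pred in [
    ("SVM-radial", svm_pred),
    ("PAM-like", pam_pred),
    ("Ensemble", ensemble_pred),
]:
    precision = precision_score(y_test, pred, zero_division=0)
    print(f"{name}: precision = {precision:.3f}")
```

Under this rule the ensemble can never produce more false positives than either base model, which is why agreement-based ensembles tend to favor precision; the cost is that some true positives are lost, and precision itself is not guaranteed to increase on every dataset.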
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.