Author:
Li Wei,Chi Yuhuan,Yu Kun,Xie Weidong
Abstract
Abstract
Background
In the field of genomics and personalized medicine, it is a key issue to find biomarkers directly related to the diagnosis of specific diseases from high-throughput gene microarray data. Feature selection technology can discover biomarkers with disease classification information.
Results
We use support vector machines as classifiers and use the five-fold cross-validation average classification accuracy, recall, precision and F1 score as evaluation metrics to evaluate the identified biomarkers. Experimental results show classification accuracy above 0.93, recall above 0.92, precision above 0.91, and F1 score above 0.94 on eight microarray datasets.
Method
This paper proposes a two-stage hybrid biomarker selection method based on ensemble filter and binary differential evolution incorporating binary African vultures optimization (EF-BDBA), which can effectively reduce the dimension of microarray data and obtain optimal biomarkers. In the first stage, we propose an ensemble filter feature selection method. The method combines an improved fast correlation-based filter algorithm with Fisher score. obviously redundant and irrelevant features can be filtered out to initially reduce the dimensionality of the microarray data. In the second stage, the optimal feature subset is selected using an improved binary differential evolution incorporating an improved binary African vultures optimization algorithm. The African vultures optimization algorithm has excellent global optimization ability. It has not been systematically applied to feature selection problems, especially for gene microarray data. We combine it with a differential evolution algorithm to improve population diversity.
Conclusion
Compared with traditional feature selection methods and advanced hybrid methods, the proposed method achieves higher classification accuracy and identifies excellent biomarkers while retaining fewer features. The experimental results demonstrate the effectiveness and advancement of our proposed algorithmic model.
Funder
the results of the research project funded by National key research and development program, china
Natural science Foundation of Liaoning Province under grant
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology
Cited by
3 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献