Abstract
Abstract
Background
Feature selection is a crucial step in machine learning analysis. Currently, many feature selection approaches do not ensure satisfying results, in terms of accuracy and computational time, when the amount of data is huge, such as in ‘Omics’ datasets.
Results
Here, we propose an innovative implementation of a genetic algorithm, called GARS, for fast and accurate identification of informative features in multi-class and high-dimensional datasets. In all simulations, GARS outperformed two standard filter-based and two ‘wrapper’ and one embedded’ selection methods, showing high classification accuracies in a reasonable computational time.
Conclusions
GARS proved to be a suitable tool for performing feature selection on high-dimensional data. Therefore, GARS could be adopted when standard feature selection approaches do not provide satisfactory results or when there is a huge amount of data to be analyzed.
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology
Reference35 articles.
1. Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. Elsevier. 2015;13:8–17.
2. Antman EM, Loscalzo J. Precision medicine in cardiology. Nat Rev Cardiol. Nat Publ Group. 2016;13:591.
3. Wang L, Chu F, Xie W. Accurate cancer classification using expressions of very few genes. IEEE/ACM Trans Comput Biol Bioinforma. 2007;4(1):40–53.
4. Bolón-Canedo V, Sánchez-Maroño N. Alonso-Betanzos A. Prog Artif Intell: Feature selection for high-dimensional data; 2016.
5. Saeys Y, Inza I, Larrañaga P. A review of feature selection techniques in bioinformatics. Bioinformatics. Oxford University Press; 2007;23:2507–2517.
Cited by
30 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献