Affiliation:
1. School of Computer and Information Engineering, Henan University, Kaifeng, China
2. China Mobile Online Service Co. Ltd, China
3. College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
Abstract
Background:
Microarray data are widely used for disease analysis and diagnosis.
However, they are hard to process directly with high classification accuracy because of their
intrinsic high dimensionality and small sample sizes. As an important data
preprocessing technique, feature selection is commonly used to reduce the dimensionality of such
datasets.
Methods:
Given the limitations of employing filter or wrapper approaches individually for feature
selection, this study proposes a novel hybrid filter-wrapper approach, CS-IFOA, for
high-dimensional datasets. First, the Chi-square test is used to filter out some irrelevant or
redundant features. Next, an improved binary Fruit Fly Optimization algorithm searches further
for the optimal feature subset without degrading the classification accuracy. A KNN classifier
with 10-fold cross-validation (10-fold CV) is used to evaluate the classification accuracy.
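The two stages described above can be sketched in a few lines of plain Python. This is only an illustrative sketch under several assumptions, not the authors' implementation: the abstract does not specify how the improved binary Fruit Fly Optimization algorithm works, so a simple bit-flip swarm search stands in for it; features are binarized at their median before the Chi-square statistic is computed; and all names (`chi2_score`, `knn_cv_accuracy`, `swarm_search`, `cs_then_search`), parameter values, and the synthetic data are hypothetical.

```python
import random

def chi2_score(col, y):
    """Chi-square statistic of one feature (binarized at its median) vs. class labels."""
    med = sorted(col)[len(col) // 2]
    bits = [int(v >= med) for v in col]
    n = len(y)
    score = 0.0
    for b in (0, 1):
        for c in set(y):
            observed = sum(1 for bi, yi in zip(bits, y) if bi == b and yi == c)
            expected = bits.count(b) * y.count(c) / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
    return score

def knn_cv_accuracy(X, y, mask, k=3, folds=10):
    """k-NN accuracy under k-fold cross-validation, using only features where mask is 1."""
    idx = [j for j, m in enumerate(mask) if m]
    if not idx:
        return 0.0
    correct = 0
    for f in range(folds):
        train = [t for t in range(len(X)) if t % folds != f]
        for i in range(f, len(X), folds):  # samples held out in fold f
            nearest = sorted(
                train, key=lambda t: sum((X[i][j] - X[t][j]) ** 2 for j in idx)
            )[:k]
            votes = [y[t] for t in nearest]
            correct += max(set(votes), key=votes.count) == y[i]
    return correct / len(X)

def fitness(X, y, mask):
    # Higher accuracy wins; at equal accuracy the smaller subset wins.
    return (knn_cv_accuracy(X, y, mask), -sum(mask))

def swarm_search(X, y, n_features, flies=8, iters=15, seed=0):
    """Stand-in wrapper search (assumption): flies flip bits around the best mask."""
    rng = random.Random(seed)
    best = [1] * n_features
    best_fit = fitness(X, y, best)
    for _ in range(iters):
        for _ in range(flies):
            cand = [b ^ (rng.random() < 0.2) for b in best]  # flip each bit w.p. 0.2
            fit = fitness(X, y, cand)
            if fit > best_fit:
                best, best_fit = cand, fit
    return best, best_fit

def cs_then_search(X, y, keep=8):
    """Filter stage (top `keep` Chi-square scores), then wrapper stage on the survivors."""
    n_feat = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_feat)]
    kept = sorted(range(n_feat), key=lambda j: -scores[j])[:keep]
    X_f = [[row[j] for j in kept] for row in X]
    mask, (acc, _) = swarm_search(X_f, y, keep)
    selected = [kept[j] for j, m in enumerate(mask) if m]
    return selected, acc

# Synthetic toy data (assumption): 40 samples, 20 features, only features 0 and 1 informative.
data_rng = random.Random(1)
y = [0] * 20 + [1] * 20
X = [[data_rng.gauss(3.0 * label if j < 2 else 0.0, 1.0) for j in range(20)] for label in y]

selected, acc = cs_then_search(X, y)
print("selected feature indices:", selected)
print("10-fold CV accuracy:", acc)
```

The fitness tuple compares accuracy first and subset size second, which mirrors the stated goal of shrinking the feature subset without degrading classification accuracy. On real microarray data the filter threshold, `k`, and the search budget would all need tuning.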
Results:
Extensive experiments on six benchmark biomedical datasets show that the
proposed CS-IFOA achieves superior performance compared with other state-of-the-art
methods: it selects fewer features while achieving higher classification
accuracy. Furthermore, the standard deviation of the experimental results is relatively small, which
indicates that the proposed algorithm is relatively robust.
Conclusion:
The results confirm the efficiency of our approach in identifying important
genes in high-dimensional biomedical datasets. The approach can be used as an ideal pre-processing
tool to help optimize the feature selection process and improve the efficiency of disease diagnosis.
Funder
Scientific Research Foundation of the Higher Education Institutions of Henan Province
China Postdoctoral Science Foundation
Science and Technology Development Plan Project of Henan Province
National Natural Science Foundation of China
Publisher
Bentham Science Publishers Ltd.
Subject
Computational Mathematics, Genetics, Molecular Biology, Biochemistry
Cited by
5 articles.