Affiliation:
1. Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China
2. Victoria University of Wellington, Kelburn Parade, Wellington, New Zealand
Abstract
Many evolutionary computation (EC) methods have been used to solve feature selection problems, and they perform well on most small-scale problems. However, as the dimensionality of a feature selection problem increases, the solution space grows exponentially. Meanwhile, datasets typically contain far more irrelevant features than relevant ones, which creates many local optima in this huge solution space. Consequently, existing EC methods still tend to stagnate in local optima on large-scale feature selection problems. Furthermore, large-scale feature selection problems on different datasets may have different properties, so an existing EC method with only one candidate solution generation strategy (CSGS) may perform poorly across different problems. In addition, finding a suitable EC method, along with suitable parameter values, for a given large-scale feature selection problem is time-consuming. In this article, we propose a self-adaptive particle swarm optimization (SaPSO) algorithm for feature selection, particularly large-scale feature selection. First, an encoding scheme for the feature selection problem is employed in SaPSO. Second, three important issues related to self-adaptive algorithms are investigated. After that, the SaPSO algorithm with a typical self-adaptive mechanism is proposed. Experimental results on 12 datasets show that the solution size obtained by SaPSO is smaller than that of its EC counterparts on all datasets, and that SaPSO outperforms both its non-EC and EC counterparts in classification accuracy on most training sets and most test sets. Moreover, as the dimensionality of the feature selection problem increases, the advantages of SaPSO become more prominent. These results indicate that SaPSO is well suited to feature selection problems, particularly large-scale ones.
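The abstract outlines the core idea of SaPSO: maintain several candidate solution generation strategies (CSGSs) and let the algorithm learn, during the search, which strategy to apply. The paper's exact encoding, CSGSs, and probability-update rules are not reproduced here, so the following is only a minimal sketch of the general self-adaptive strategy-selection pattern on a binary feature mask. The two placeholder strategies, the success-proportional probability update, the greedy replacement, and the toy fitness function (which assumes the first 10 of 50 synthetic features are relevant) are all illustrative assumptions, not the paper's method.

```python
# Minimal sketch of self-adaptive strategy selection for binary feature
# selection. NOT the SaPSO algorithm from the paper: the strategies,
# probability update, and fitness below are simplified placeholders.
import random

NUM_FEATURES = 50
RELEVANT = set(range(10))          # assumption: features 0-9 are "relevant"
POP_SIZE, ITERATIONS = 20, 100

def fitness(mask):
    """Toy fitness: reward selected relevant features, penalize subset size."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    return hits - 0.1 * sum(mask)

def flip_one(mask, _best):
    """Placeholder CSGS 1: flip a single random bit (exploration)."""
    child = mask[:]
    i = random.randrange(NUM_FEATURES)
    child[i] = 1 - child[i]
    return child

def move_toward_best(mask, best):
    """Placeholder CSGS 2: copy each bit from the global best w.p. 0.5."""
    return [b if random.random() < 0.5 else m for m, b in zip(mask, best)]

strategies = [flip_one, move_toward_best]
success = [1.0] * len(strategies)  # smoothed success counters per CSGS

population = [[random.randint(0, 1) for _ in range(NUM_FEATURES)]
              for _ in range(POP_SIZE)]
best = max(population, key=fitness)

for _ in range(ITERATIONS):
    total = sum(success)
    probs = [s / total for s in success]
    for idx, mask in enumerate(population):
        # Self-adaptive choice: pick a CSGS with probability proportional
        # to its past success, so effective strategies are used more often.
        k = random.choices(range(len(strategies)), weights=probs)[0]
        child = strategies[k](mask, best)
        if fitness(child) > fitness(mask):   # greedy replacement
            population[idx] = child
            success[k] += 1                  # credit the chosen strategy
            if fitness(child) > fitness(best):
                best = child

print("selected features:", [i for i, b in enumerate(best) if b])
print("fitness:", fitness(best))
```

The design point this sketch illustrates is that strategies that recently produced improved subsets are chosen more often, which lets the search adapt to the properties of a given dataset without manually picking one EC method or tuning its parameters.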
Funder
National Natural Science Foundation of China
Priority Academic Program Development of Jiangsu Higher Education Institutions
Natural Science Foundation of Jiangsu Province
Natural Science Foundation of the Jiangsu Higher Education Institutions of China
Publisher
Association for Computing Machinery (ACM)
Cited by
288 articles.