Cooperative co-evolution for feature selection in Big Data with random feature grouping

Published:2020-12 Issue:1 Volume:7 Page:
ISSN:2196-1115
Container-title:Journal of Big Data
language:en
Short-container-title:J Big Data

Author:

Rashid A. N. M. Bazlur, Ahmed Mohiuddin, Sikos Leslie F., Haskell-Dowland Paul

Abstract

A massive amount of data is generated with the evolution of modern technologies. This high-throughput data generation results in Big Data, which consist of many features (attributes). However, irrelevant features may degrade the classification performance of machine learning (ML) algorithms. Feature selection (FS) is a technique used to select a subset of relevant features that represent the dataset. Evolutionary algorithms (EAs) are widely used search strategies in this domain. A variant of EAs, called cooperative co-evolution (CC), which uses a divide-and-conquer approach, is a good choice for optimization problems. The existing solutions have poor performance because of some limitations, such as not considering feature interactions, dealing with only an even number of features, and decomposing the dataset statically. In this paper, a novel random feature grouping (RFG) has been introduced with its three variants to dynamically decompose Big Data datasets and to increase the probability of grouping interacting features into the same subcomponent. RFG can be used in CC-based FS processes, hence called Cooperative Co-Evolutionary-Based Feature Selection with Random Feature Grouping (CCFSRFG). Experimental analysis was performed using six widely used ML classifiers on seven different datasets from the UCI ML repository and Princeton University Genomics repository with and without FS. The experimental results indicate that in most cases [i.e., with naïve Bayes (NB), support vector machine (SVM), k-Nearest Neighbor (k-NN), J48, and random forest (RF)] the proposed CCFSRFG-1 outperforms an existing solution (a CC-based FS method called CCEAFS), CCFSRFG-2, and the use of all features in terms of accuracy, sensitivity, and specificity.
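The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' RFG algorithm or any of its three variants, whose details the abstract does not give. It only assumes that a decomposition step shuffles the feature indices and splits them into subcomponents, and that the split may be repeated to regroup features dynamically. The function name `random_feature_grouping` and its parameters `n_features`, `n_groups`, and `seed` are hypothetical.

```python
import random


def random_feature_grouping(n_features, n_groups, seed=None):
    """Randomly partition feature indices 0..n_features-1 into n_groups groups.

    Illustrative sketch only: a plain random partition, not the paper's RFG.
    """
    if not 1 <= n_groups <= n_features:
        raise ValueError("n_groups must be between 1 and n_features")
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Strided slicing keeps group sizes within one of each other,
    # so an odd number of features is handled without special cases.
    return [sorted(indices[g::n_groups]) for g in range(n_groups)]


# Example: 11 features split into 3 subcomponents.
groups = random_feature_grouping(n_features=11, n_groups=3, seed=0)
for i, group in enumerate(groups):
    print(f"subcomponent {i}: {group}")
```

Calling the function again with a different seed yields a new grouping, which is one simple way to give interacting features repeated chances of landing in the same subcomponent. In a CC-based FS process, each subcomponent would then be searched by its own subpopulation.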

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems

Link

http://link.springer.com/content/pdf/10.1186/s40537-020-00381-y.pdf

Reference: 99 articles.

1. Rashid ANMB. Access methods for Big Data: current status and future directions. EAI Endorsed Trans Scalable Inf Syst. 2018. https://doi.org/10.4108/eai.28-12-2017.153520.

2. Chakraborty B, Kawamura A. A new penalty-based wrapper fitness function for feature subset selection with evolutionary algorithms. J Inf Telecommun. 2018;2(2):163–80. https://doi.org/10.1080/24751839.2018.1423792.

3. Khalid S, Khalil T, Nasreen S. A survey of feature selection and feature extraction techniques in machine learning. In: 2014 science and information conference. 2014. p. 372–8. https://doi.org/10.1109/SAI.2014.6918213.

4. Miao J, Niu L. A survey on feature selection. Procedia Comput Sci. 2016;91:919–26. https://doi.org/10.1016/j.procs.2016.07.111.

5. Rashid AB, Choudhury T. Knowledge management overview of feature selection problem in high-dimensional financial data: cooperative co-evolution and MapReduce perspectives. Probl Perspect Manag. 2019;17(4):340. https://doi.org/10.21511/ppm.17(4).2019.28.

Cited by 16 articles.

1. Survey on Evolutionary Deep Learning: Principles, Algorithms, Applications, and Open Issues;ACM Computing Surveys;2023-09-15

2. Discarding–Recovering and Co-Evolution Mechanisms Based Evolutionary Algorithm for Hyperspectral Feature Selection;Remote Sensing;2023-07-30

3. Review of feature selection approaches based on grouping of features;PeerJ;2023-07-17

4. Adaptive cooperative coevolutionary differential evolution for parallel feature selection in high-dimensional datasets;The Journal of Supercomputing;2023-04-16

5. Wrapper-based optimized feature selection using nature-inspired algorithms;Neural Computing and Applications;2023-03-06