A high-dimensional classification approach based on class-dependent feature subspace
Published: 2017-12-04
Issue: 10
Volume: 117
Page: 2325-2339
ISSN: 0263-5577
Container-title: Industrial Management & Data Systems
Language: en
Short-container-title: IMDS
Author: Chen Fuzan, Wu Harris, Dou Runliang, Li Minqiang
Abstract
Purpose
The purpose of this paper is to build a compact and accurate classifier for high-dimensional classification.
Design/methodology/approach
A classification approach based on class-dependent feature subspace (CFS) is proposed. A CFS pairs each class with its own support vector machine (SVM) classifier and the discriminative features associated with that class. For each class, a genetic algorithm (GA)-based approach simultaneously evolves the best subset of discriminative features and the SVM classifier. To guarantee convergence and efficiency, the authors customize the GA in terms of encoding strategy, fitness evaluation, and genetic operators.
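The design described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoding, fitness function, or genetic operators, so the boolean feature mask, the size-penalized cross-validation fitness, the truncation selection, one-point crossover, and bit-flip mutation used here are all assumptions. The sketch also keeps the SVM hyperparameters fixed and evolves only the feature mask, whereas the paper evolves the classifier together with the features. The max-margin rule for combining the per-class classifiers, the Wine data set, and the function names `fitness`, `evolve_mask`, `fit_cfs`, and `predict_cfs` are likewise hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_svm():
    # Fixed hyperparameters (assumption); the paper evolves the classifier as well.
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))


def fitness(mask, X, y_binary, penalty=0.01):
    """Cross-validated one-vs-rest accuracy minus a feature-count penalty (assumed form)."""
    if not mask.any():  # an empty feature subset cannot be evaluated
        return 0.0
    acc = cross_val_score(make_svm(), X[:, mask], y_binary, cv=3).mean()
    return acc - penalty * mask.sum()


def evolve_mask(X, y_binary, rng, pop_size=12, generations=8, p_mut=0.1):
    """Plain generational GA over boolean feature masks (hypothetical operators)."""
    n_features = X.shape[1]
    pop = rng.random((pop_size, n_features)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y_binary) for ind in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n_features)            # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_features) < p_mut      # bit-flip mutation
            children.append(child)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind, X, y_binary) for ind in pop])
    return pop[np.argmax(scores)]


def fit_cfs(X, y, seed=0):
    """Learn one (feature mask, SVM) pair per class."""
    rng = np.random.default_rng(seed)
    models = {}
    for label in np.unique(y):
        y_binary = (y == label).astype(int)
        mask = evolve_mask(X, y_binary, rng)
        models[label] = (mask, make_svm().fit(X[:, mask], y_binary))
    return models


def predict_cfs(models, X):
    """Score each sample in every class's own subspace; the largest margin wins (assumption)."""
    labels = np.array(list(models))
    margins = np.column_stack(
        [clf.decision_function(X[:, mask]) for mask, clf in models.values()]
    )
    return labels[margins.argmax(axis=1)]


X, y = load_wine(return_X_y=True)
models = fit_cfs(X, y)
pred = predict_cfs(models, X)
for label, (mask, _) in models.items():
    print(f"class {label}: {int(mask.sum())} of {mask.size} features selected")
```

Each class ends up with its own mask, so the features retained for one class can differ from those retained for another, which is the property the CFS idea relies on for concise per-class interpretation.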
Findings
Experimental studies demonstrated that the proposed CFS-based approach is superior to other state-of-the-art classification algorithms on UCI data sets in terms of both concise interpretation and predictive power for high-dimensional data.
Research limitations/implications
UCI data sets rather than real industrial data are used to evaluate the proposed approach. In addition, only single-label classification is addressed in the study.
Practical implications
The proposed method not only constructs an accurate classification model but also obtains a compact combination of discriminative features. It helps business decision makers gain a concise understanding of high-dimensional data.
Originality/value
The authors propose a compact and effective classification approach for high-dimensional data. Instead of using the same feature subset for all classes, the proposed CFS-based approach obtains the optimal subset of discriminative features and an SVM classifier for each class. The proposed approach enhances both interpretability and predictive power for high-dimensional data.
Subject
Industrial and Manufacturing Engineering,Strategy and Management,Computer Science Applications,Industrial relations,Management Information Systems
Cited by
1 article.