An Ensemble Classification Method for High-Dimensional Data Using Neighborhood Rough Set

Published: 2021-11-24 Volume: 2021 Page: 1-12
ISSN: 1099-0526
Container-title: Complexity
Language: en

Author:

Zhang Jing¹, Lu Guang¹, Li Jiaquan¹, Li Chuanwen¹

Affiliation:

1. School of Computer Science and Engineering, Northeastern University, Shenyang 110004, China

Abstract

Mining useful knowledge from high-dimensional data is an active research topic. Microarray data combine high dimensionality with a small sample size, which makes efficient and effective sample classification and feature selection challenging. Feature selection is needed during model construction to reduce time and space consumption. A feature selection model based on prior knowledge and rough sets is therefore proposed. Pathway knowledge is used to select feature subsets, and a rough set based on the intersection neighborhood is then used to select the important features in each subset, since it can select features without redundancy and deal with numerical features directly. To improve the diversity among base classifiers and the efficiency of classification, only a subset of the base classifiers is selected. Classifiers are grouped into several clusters by k-means clustering using the proposed combination distance of Kappa-based diversity and accuracy. The base classifier with the best classification performance in each cluster is selected to generate the final ensemble model. Experimental results on three Arabidopsis thaliana stress response datasets showed that the proposed method achieved better classification performance than existing ensemble models.
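
The classifier-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the combination distance, so everything specific here is an assumption: the weighted sum in `combined_distance`, the weight `alpha`, the choice to run k-means on the rows of the distance matrix, and all function names are hypothetical and are not taken from the paper.

```python
import random


def accuracy(pred, truth):
    """Fraction of samples on which pred matches truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def cohen_kappa(a, b):
    """Cohen's kappa agreement between two lists of predicted labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in set(a) | set(b))
    if expected == 1.0:  # both classifiers always output the same single label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def combined_distance(preds, truth, alpha=0.5):
    """ASSUMED distance: alpha * (1 - kappa) / 2 + (1 - alpha) * |acc_i - acc_j|."""
    m = len(preds)
    acc = [accuracy(p, truth) for p in preds]
    dist = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            diversity = (1.0 - cohen_kappa(preds[i], preds[j])) / 2.0
            d = alpha * diversity + (1.0 - alpha) * abs(acc[i] - acc[j])
            dist[i][j] = dist[j][i] = d
    return dist


def kmeans_labels(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the old center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def select_ensemble(preds, truth, k, alpha=0.5):
    """Cluster the classifiers, then keep the most accurate one per cluster."""
    dist = combined_distance(preds, truth, alpha)
    # ASSUMPTION: each classifier is represented by its row of the distance matrix.
    labels = kmeans_labels(dist, k)
    acc = [accuracy(p, truth) for p in preds]
    chosen = []
    for c in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        chosen.append(max(members, key=lambda i: acc[i]))
    return chosen


# Toy validation-set predictions from four hypothetical base classifiers.
truth = [0, 1, 0, 1, 0, 1, 0, 1]
preds = [
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 0, 1],
]
chosen = select_ensemble(preds, truth, k=2)
print("selected base classifiers:", chosen)
```

The selected indices would then feed a combination rule such as majority voting. In practice the predictions should come from a held-out validation set, so that the per-cluster accuracy ranking is not biased by the training data.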

Funder

National Natural Science Foundation of China

Publisher

Hindawi Limited

Subject

Multidisciplinary,General Computer Science

Link

http://downloads.hindawi.com/journals/complexity/2021/8358921.pdf

Cited by 2 articles.

1. Statistical method for clustering high-dimensional data based on fuzzy mathematical modeling;Applied Mathematics and Nonlinear Sciences;2023-12-11

2. Feature selection based on self-information and entropy measures for incomplete neighborhood decision systems;Complex & Intelligent Systems;2022-10-11