Affiliation:
1. School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai, 201620, China
Abstract
Objective:
Gene expression profiles are a valuable data source for studying tumors,
but they are high-dimensional and highly redundant.
Gene selection is therefore an important step in microarray data classification.
Method:
This paper proposes a feature selection method based on the maximum mutual information coefficient
and graph theory. Each feature of the gene expression data is treated as a vertex
of a graph, and the maximum mutual information coefficient between genes measures
the relationship between vertices, which yields an undirected graph. Core and coritivity
theory is then used to determine the feature subset of the gene data.
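The pipeline described above (features as vertices, a pairwise dependency score deciding the edges, then a core search) can be illustrated with a minimal sketch. Several parts are assumptions rather than the paper's procedure: the absolute Pearson correlation stands in for the maximum mutual information coefficient, the edge `threshold` is a hypothetical parameter, and the core is found by brute force using the common definition of coritivity as the maximum of (components of G − S) − |S| over vertex cuts S. How the paper maps the core to the final gene subset is not stated in the abstract, so the sketch stops at the core itself.

```python
from itertools import combinations
from math import sqrt


def abs_pearson(x, y):
    """Stand-in dependency score in [0, 1]; the paper uses MIC instead."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def build_graph(features, threshold):
    """One vertex per feature; undirected edge where the score >= threshold.

    `threshold` is a hypothetical parameter, not taken from the paper.
    """
    adj = {i: set() for i in range(len(features))}
    for i, j in combinations(range(len(features)), 2):
        if abs_pearson(features[i], features[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def count_components(adj, removed):
    """Number of connected components left after deleting `removed`."""
    seen, count = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count


def core_and_coritivity(adj):
    """Brute-force search for a vertex cut S maximising components(G - S) - |S|.

    Exponential in the number of vertices, so only usable on toy graphs.
    Returns (None, None) if the graph has no vertex cut.
    """
    base = count_components(adj, set())
    best_set, best_value = None, None
    for k in range(1, len(adj) - 1):
        for subset in combinations(adj, k):
            comps = count_components(adj, set(subset))
            if comps <= base:  # removing this subset does not separate the graph
                continue
            value = comps - k
            if best_value is None or value > best_value:
                best_set, best_value = set(subset), value
    return best_set, best_value


# Toy star graph: vertex 0 is linked to 1, 2 and 3, which are not linked to each other.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
core, coritivity = core_and_coritivity(star)
```

On the star graph, removing the hub leaves three isolated vertices, so the sketch reports the hub as the core. Real expression data has thousands of genes, where an exhaustive search like this is infeasible and some heuristic would be needed.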
Results:
To avoid reliance on any single classifier or evaluation metric, we evaluated classification
performance with three different classification models and three evaluation metrics:
accuracy, F1-score, and AUC. Experimental results on six different types
of genetic data show that the proposed algorithm achieves high accuracy and robustness compared to
other advanced feature selection methods.
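For reference, the three metrics named above can be computed as in the following sketch. It assumes a binary task with labels 0 and 1 (the abstract does not say whether the six datasets are binary or multi-class), and the function names and sample values are illustrative only, not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores, positive=1):
    """Probability that a random positive is scored above a random negative.

    Ties count as one half. Quadratic in the sample count, which is fine
    for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and classifier scores, for illustration only.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
scores = [0.9, 0.2, 0.4, 0.8, 0.5]
acc, f1, roc_auc = accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores)
```

Accuracy and F1 depend on hard predictions at a fixed decision threshold, while AUC depends only on how the scores rank the samples, which is one reason to report all three.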
Conclusion:
The method considers the importance and correlation of features at the same
time and addresses the problem of gene selection in microarray data classification.
Publisher
Bentham Science Publishers Ltd.
Subject
Organic Chemistry, Computer Science Applications, Drug Discovery, General Medicine
Cited by
1 article.
1. Combinatorial Study of Chemical Graphs; Combinatorial Chemistry & High Throughput Screening; 2024-03