A stable gene selection in microarray data analysis

Published:2006-04-27 Issue:1 Volume:7 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Yang Kun,Cai Zhipeng,Li Jianzhong,Lin Guohui

Abstract

Background: Microarray data analysis is notorious for involving a huge number of genes compared to a relatively small number of samples. Gene selection aims to detect the genes that are most significantly differentially expressed under different conditions, and it has been a central research focus. In general, a better gene selection method can significantly improve classification performance. One of the difficulties in gene selection is that the numbers of samples under different conditions can vary widely.

Results: Two novel gene selection methods are proposed in this paper. They are not affected by unbalanced sample class sizes and do not assume any explicit statistical model for the gene expression values. They were evaluated on eight publicly available microarray datasets, using leave-one-out cross-validation and 5-fold cross-validation. Performance is measured by the classification accuracy obtained with the top-ranked genes selected on the training datasets.

Conclusion: The experimental results showed that the proposed gene selection methods are efficient, effective, and robust in identifying differentially expressed genes. With existing SVM-based and KNN-based classifiers, the genes selected by the proposed methods in general give more accurate classification results, particularly when the sample class sizes in the training dataset are unbalanced.
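The evaluation protocol mentioned in the abstract (ranking genes on the training portion only, then classifying the held-out sample with the top-ranked genes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: the ranking score is a generic Welch-style statistic used as a placeholder for the two proposed selection methods, the classifier is a plain 1-nearest-neighbour rule, and the names `rank_genes`, `knn1_predict`, `loocv_accuracy`, and the toy matrix `X` are all hypothetical.

```python
from statistics import mean, pvariance  # standard library only


def rank_genes(X, y):
    """Return gene indices sorted from most to least discriminative.

    Placeholder score (NOT the paper's method):
    |mean_a - mean_b| / sqrt(var_a / n_a + var_b / n_b) for two classes.
    """
    labels = sorted(set(y))
    a = [row for row, lab in zip(X, y) if lab == labels[0]]
    b = [row for row, lab in zip(X, y) if lab == labels[1]]
    scores = []
    for g in range(len(X[0])):
        va = [row[g] for row in a]
        vb = [row[g] for row in b]
        denom = (pvariance(va) / len(va) + pvariance(vb) / len(vb)) ** 0.5
        scores.append(abs(mean(va) - mean(vb)) / (denom + 1e-12))
    return sorted(range(len(scores)), key=lambda g: -scores[g])


def knn1_predict(train_X, train_y, x, genes):
    """Label of the nearest training sample, using only the selected genes."""
    def sq_dist(u):
        return sum((u[g] - x[g]) ** 2 for g in genes)
    nearest = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i]))
    return train_y[nearest]


def loocv_accuracy(X, y, top_k):
    """Leave-one-out accuracy; genes are re-ranked inside every fold."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        # Selection sees only the training samples, never the held-out one.
        genes = rank_genes(train_X, train_y)[:top_k]
        if knn1_predict(train_X, train_y, X[i], genes) == y[i]:
            correct += 1
    return correct / len(X)


# Toy, unbalanced data (6 vs 2 samples, 4 genes); values are made up.
X = [
    [1.0, 5.0, 3.0, 2.0],
    [1.2, 4.0, 2.5, 2.2],
    [0.9, 6.0, 3.5, 1.8],
    [1.1, 5.5, 2.0, 2.1],
    [1.0, 4.5, 3.2, 1.9],
    [0.8, 5.2, 2.8, 2.0],
    [3.0, 5.1, 3.1, 2.1],
    [3.2, 4.8, 2.7, 1.9],
]
y = [0, 0, 0, 0, 0, 0, 1, 1]

if __name__ == "__main__":
    print(loocv_accuracy(X, y, top_k=1))
```

Re-ranking the genes inside each fold keeps the held-out sample out of the selection step, which is what "top-ranked genes based on the training datasets" implies. A 5-fold variant would differ only in how the train and test indices are split.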

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-7-228.pdf

Reference: 19 articles.

1. Dudoit S, Fridlyand J, Speed TP: Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. Journal of the American Statistical Association 2002, 97: 77–87.

2. Xiong M, Fang X, Zhao J: Biomarker Identification by Feature Wrappers. Genome Research 2001, 11: 1878–1887.

3. Mukherjee S, Roberts SJ: A Theoretical Analysis of Gene Selection. Proceedings of IEEE Computer Society Bioinformatics Conference (CSB 2004) 2004, 131–141.

4. Baldi P, Long AD: A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized t-test and Statistical Inferences of Gene Changes. Bioinformatics 2001, 17: 509–519.

5. Guyon I, Weston J, Barnhill S, Vapnik V: Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning 2002, 46: 389–422.

Cited by 145 articles.

1. OSFS‐Vague: Online streaming feature selection algorithm based on vague set;CAAI Transactions on Intelligence Technology;2024-04-08

2. Gene selection for high dimensional biological datasets using hybrid island binary artificial bee colony with chaos game optimization;Artificial Intelligence Review;2024-02-13

3. Intelligent Computing Approach in Gene Evaluation for Carcinogenic Disease Detection;Computational Intelligence Methods and Applications;2024

4. Intelligent Computing Approaches for Carcinogenic Disease Detection: A Review;Computational Intelligence Methods and Applications;2024

5. Gene selection for microarray data classification based on mutual information and binary whale optimization algorithm;Handbook of Whale Optimization Algorithm;2024