CMA – a comprehensive Bioconductor package for supervised classification with high dimensional data

Published:2008-10-16 Issue:1 Volume:9 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Slawski M, Daumer M, Boulesteix A-L

Abstract

Background: For the last eight years, microarray-based classification has been a major topic in statistics, bioinformatics and biomedicine research. Traditional methods often yield unsatisfactory results or may even be inapplicable in the so-called "p ≫ n" setting where the number of predictors p by far exceeds the number of observations n, hence the term "ill-posed-problem". Careful model selection and evaluation satisfying accepted good-practice standards is a very complex task for statisticians without experience in this area or for scientists with limited statistical background. The multiplicity of available methods for class prediction based on high-dimensional data is an additional practical challenge for inexperienced researchers.

Results: In this article, we introduce a new Bioconductor package called CMA (standing for "Classification for MicroArrays") for automatically performing variable selection, parameter tuning, classifier construction, and unbiased evaluation of the constructed classifiers using a large number of usual methods. Without much time and effort, users are provided with an overview of the unbiased accuracy of most top-performing classifiers. Furthermore, the standardized evaluation framework underlying CMA can also be beneficial in statistical research for comparison purposes, for instance if a new classifier has to be compared to existing approaches.

Conclusion: CMA is a user-friendly comprehensive package for classifier construction and evaluation implementing most usual approaches. It is freely available from the Bioconductor website at http://bioconductor.org/packages/2.3/bioc/html/CMA.html.
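The workflow named in the abstract (variable selection, classifier construction, and unbiased evaluation in a "p ≫ n" setting) can be illustrated with a minimal sketch. This is not the CMA API: CMA is an R package, and the Python below is only a standard-library illustration under stated assumptions. All function names (`make_data`, `select_features`, `fit_centroids`, `predict`, `cv_error`), the simulated data, the ranking score, and the nearest-centroid classifier are hypothetical choices made for this example. The one point it aims to show is that variable selection is repeated inside each cross-validation training fold, so the test observations never influence which variables are kept.

```python
import random
import statistics


def make_data(n=40, p=200, informative=5, shift=3.0, seed=0):
    """Simulate a two-class p >> n data set (assumed toy data, not from the paper)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes: 0, 1, 0, 1, ...
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in range(informative):
            row[j] += shift * label  # only the first few features carry signal
        X.append(row)
        y.append(label)
    return X, y


def select_features(X, y, k):
    """Keep the k features with the largest standardized mean difference."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-12
        scores.append((abs(statistics.fmean(a) - statistics.fmean(b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def fit_centroids(X, y, features):
    """Compute one centroid per class over the selected features."""
    centroids = {}
    for lab in set(y):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroids[lab] = [statistics.fmean(r[j] for r in rows) for j in features]
    return centroids


def predict(centroids, features, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((row[j] - c) ** 2 for j, c in zip(features, center))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def cv_error(X, y, k_features=5, folds=5, seed=1):
    """Estimate misclassification rate by k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    errors = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        # Selection uses the training fold only; this keeps the estimate honest.
        feats = select_features(X_train, y_train, k_features)
        centroids = fit_centroids(X_train, y_train, feats)
        errors += sum(predict(centroids, feats, X[i]) != y[i] for i in test)
    return errors / len(X)


if __name__ == "__main__":
    X, y = make_data()
    print("cross-validated error rate:", cv_error(X, y))
```

Selecting variables once on the full data set and then cross-validating only the classifier would typically give an optimistically biased error estimate, which is the kind of pitfall a standardized evaluation framework is meant to avoid.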

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-9-439.pdf

References: 60 articles.

1. Ihaka R, Gentleman R: R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics 1996, 5: 299–314. 10.2307/1390807

2. Gentleman R, Carey J, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JYH, Zhang J: Bioconductor: Open software development for computational biology and bioinformatics. Genome Biology 2004, 5: R80. 10.1186/gb-2004-5-10-r80

3. Tibshirani R, Hastie T, Narasimhan B, Chu G: Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proceedings of the National Academy of Sciences 2002, 99: 6567–6572. 10.1073/pnas.082099299

4. Breiman L: Random Forests. Machine Learning 2001, 45: 5–32. 10.1023/A:1010933404324

5. Boulesteix AL, Strimmer K: Partial Least Squares: A versatile tool for the analysis of high-dimensional genomic data. Briefings in Bioinformatics 2007, 8: 32–44. 10.1093/bib/bbl016

Cited by 81 articles.

1. Differential Expression, Functional and Machine Learning Analysis of High-Throughput –Omics Data Using Open-Source Tools;Methods in Molecular Biology;2022-11-24

2. A novel cross-species differential tumor classification method based on exosome-derived microRNA biomarkers established by human-dog lymphoid and mammary tumor cell lines' transcription profiles;Veterinary World;2022-05-11

3. SARS-CoV-2 infection and acute ischemic stroke in Lombardy, Italy;Journal of Neurology;2021-05-24

4. Classification accuracy of TMS for the diagnosis of mild cognitive impairment;Brain Stimulation;2021-03

5. Comparative validation of the BOADICEA and Tyrer-Cuzick breast cancer risk models incorporating classical risk factors and polygenic risk in a population-based prospective cohort of women of European ancestry;Breast Cancer Research;2021-02-15