Affiliation:
1. Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
Abstract
The false discovery rate (FDR) is a widely used metric of statistical significance for genomic data analyses that involve multiple hypothesis testing. Power and sample size considerations are important in planning studies that perform these types of genomic data analyses. Here, we propose a three-rectangle approximation of a p-value histogram to derive a formula to compute the statistical power and sample size for analyses that involve the FDR. We also introduce the R package FDRsamplesize2, which incorporates these and other power calculation formulas to compute power for a broad variety of studies not covered by other FDR power calculation software. A few illustrative examples are provided. The FDRsamplesize2 package is available on CRAN.
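As a rough illustration of the kind of calculation involved, the sketch below uses the standard mixture relationship FDR ≈ π0·α / (π0·α + (1 − π0)·power), where π0 is the proportion of true null hypotheses and α is the per-test p-value threshold. It is not the three-rectangle formula proposed in the paper and does not use the FDRsamplesize2 interface. The function names, the choice of a two-sided two-sample z-test, and the example inputs are all assumptions made for illustration.

```python
from statistics import NormalDist


def fixed_threshold_alpha(fdr, pi0, avg_power):
    """Per-test p-value threshold at which the expected FDR is roughly `fdr`.

    Obtained by solving fdr = pi0*alpha / (pi0*alpha + (1 - pi0)*avg_power)
    for alpha. All names here are hypothetical, chosen for this sketch.
    """
    if not (0.0 < fdr < 1.0 and 0.0 < pi0 < 1.0 and 0.0 < avg_power <= 1.0):
        raise ValueError("fdr and pi0 must lie in (0, 1); avg_power in (0, 1]")
    alpha = (1.0 - pi0) * avg_power * fdr / (pi0 * (1.0 - fdr))
    if alpha >= 1.0:
        raise ValueError("requested FDR is met without any thresholding")
    return alpha


def z_test_power(n_per_group, effect_size, alpha):
    """Power of a two-sided, two-sample z-test (assumed test for this sketch).

    `effect_size` is the standardized mean difference between the groups.
    """
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * (n_per_group / 2.0) ** 0.5
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


def sample_size_for_fdr(fdr, pi0, avg_power, effect_size, n_max=100_000):
    """Smallest per-group n whose power at the FDR-derived alpha meets the target."""
    alpha = fixed_threshold_alpha(fdr, pi0, avg_power)
    for n in range(2, n_max + 1):
        if z_test_power(n, effect_size, alpha) >= avg_power:
            return n, alpha
    raise ValueError("no sample size up to n_max reaches the target power")


if __name__ == "__main__":
    # Illustrative inputs only: 90% true nulls, 5% FDR, 80% average power.
    n, alpha = sample_size_for_fdr(fdr=0.05, pi0=0.9, avg_power=0.8, effect_size=1.0)
    print(f"per-test alpha = {alpha:.5f}, n per group = {n}")
```

The sketch assumes every non-null hypothesis shares the same effect size, so average power equals the power of a single test. Handling heterogeneous effects or other test statistics would need the formulas from the paper or the package itself.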
Funder
American Lebanese Syrian Associated Charities