Powerful and interpretable control of false discoveries in two-group differential expression studies
Author:
Enjalbert-Courrech Nicolas1,
Neuvial Pierre1
Affiliation:
1. Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, UPS, F-31062 Toulouse Cedex 9, France
Abstract
Motivation
The standard approach for statistical inference in differential expression (DE) analyses is to control the false discovery rate (FDR). However, controlling the FDR does not in fact imply that the proportion of false discoveries is upper bounded. Moreover, no statistical guarantee can be given on subsets of genes selected by FDR thresholding. These known limitations are overcome by post hoc inference, which provides guarantees on the number or proportion of false discoveries among arbitrary gene selections. However, post hoc inference methods are not yet widely used for DE studies.
Results
In this article, we demonstrate the relevance and illustrate the performance of adaptive interpolation-based post hoc methods for two-group DE studies. First, we formalize the use of permutation-based methods to obtain sharp confidence bounds that are adaptive to the dependence between genes. Then, we introduce a generic linear time algorithm for computing post hoc bounds, making these bounds applicable to large-scale two-group DE studies. The use of the resulting Adaptive Simes bound is illustrated on an RNA sequencing study. Comprehensive numerical experiments based on real microarray and RNA sequencing data demonstrate the statistical performance of the method.
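To make the notion of an interpolation-based post hoc bound concrete, the following minimal Python sketch computes an upper bound on the number of false discoveries in an arbitrary gene selection S from a family of thresholds t_1 <= ... <= t_K, as the minimum over k <= min(|S|, K) of #{i in S : p_i >= t_k} + k - 1, capped at |S|. It is an illustration under stated assumptions, not the sanssouci implementation: the function names `simes_thresholds` and `posthoc_max_false_positives` are hypothetical, the thresholds are the plain (non-adaptive) Simes thresholds alpha * k / m rather than permutation-calibrated ones, and the computation sorts the selected p-values instead of reproducing the linear time algorithm of the article.

```python
import numpy as np


def simes_thresholds(m, alpha):
    """Non-adaptive Simes thresholds t_k = alpha * k / m, k = 1..m (assumption:
    no permutation-based calibration is applied here)."""
    return alpha * np.arange(1, m + 1) / m


def posthoc_max_false_positives(p_selected, thresholds):
    """Upper bound on the number of false positives among the selected p-values.

    Computes min over k of (#{p_i >= t_k} + k - 1), capped at the selection size.
    """
    p = np.sort(np.asarray(p_selected, dtype=float))
    s = p.size
    if s == 0:
        return 0
    k_max = min(s, len(thresholds))
    t = np.asarray(thresholds, dtype=float)[:k_max]
    # searchsorted(side="left") counts selected p-values strictly below t_k
    n_below = np.searchsorted(p, t, side="left")
    n_at_or_above = s - n_below
    # np.arange(k_max) supplies the (k - 1) term for k = 1..k_max
    bound = int(np.min(n_at_or_above + np.arange(k_max)))
    return min(bound, s)


if __name__ == "__main__":
    thr = simes_thresholds(m=10, alpha=0.1)  # 0.01, 0.02, ..., 0.10
    selection = [0.001, 0.002, 0.003, 0.5]   # p-values of a hypothetical gene selection
    v = posthoc_max_false_positives(selection, thr)
    print(f"at most {v} false discoveries among {len(selection)} selected genes")
```

Because the bound holds simultaneously over all selections with probability at least 1 - alpha (under the assumptions that make the Simes family valid), the same call can be repeated on any number of user-chosen gene lists without further correction. Dividing the result by the selection size gives the corresponding bound on the false discovery proportion.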
Availability and implementation
A cross-platform open source implementation within the R package sanssouci is available at https://sanssouci-org.github.io/sanssouci/.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
Fondation Catalyses at Université Paul Sabatier
Mission for Transversal and Interdisciplinary Initiatives
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability