Rediscover: an R package to identify mutually exclusive mutations

Author:

Ferrer-Bonsoms Juan A (1), Jareno Laura (1), Rubio Angel (1)

Affiliation:

1. Department of Biomedical Engineering and Sciences, TECNUN, University of Navarra, San Sebastian, Spain

Abstract

Motivation: Discover is an algorithm developed to identify mutually exclusive genomic events. Its main contribution is a statistical analysis based on the Poisson–Binomial (PB) distribution that takes into account the mutation rates of genes and samples. Discover is very effective at identifying mutually exclusive mutations, but at the expense of speed in large datasets: the PB is computationally costly to estimate, and checking all the potential mutually exclusive alterations requires millions of tests.

Results: We have implemented a new version of the package, called Rediscover, that implements exact and approximate computations of the PB. The exact implementation in Rediscover is slightly faster than Discover for large and medium-sized datasets. The approximation is 100–1000 times faster on these datasets, making it possible to get results in less than a minute on a standard desktop. The memory footprint is also smaller in Rediscover. The new package provides some functions to integrate its usage with other R packages such as maftools and TCGAbiolinks.

Availability and implementation: Rediscover is available on CRAN (https://cran.r-project.org/web/packages/Rediscover/index.html).

Supplementary information: Supplementary data are available at Bioinformatics online.
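To make the PB-based test concrete, here is a minimal Python sketch of the idea. It is not the Rediscover implementation. It assumes that each sample has a known probability of carrying a mutation in each of two genes, that the two genes are independent under the null hypothesis, and that the evidence for mutual exclusivity is the lower tail of the PB distribution of the co-mutation count. The function names and the probabilities are hypothetical. The approximation shown is the classical Poisson approximation, whose error is bounded by Le Cam's theorem (reference 1); the abstract does not state which approximation Rediscover uses.

```python
import math


def poisson_binomial_pmf(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i) variables.

    Dynamic programming over the samples; cost grows quadratically
    with the number of samples, which is why exact tests are slow.
    """
    pmf = [1.0]  # with zero samples, the count is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this sample is not co-mutated
            nxt[k + 1] += mass * p       # this sample is co-mutated
        pmf = nxt
    return pmf


def exact_lower_tail(probs, observed):
    """P(X <= observed) under the exact Poisson-Binomial distribution."""
    return sum(poisson_binomial_pmf(probs)[: observed + 1])


def poisson_lower_tail(probs, observed):
    """P(X <= observed) under a Poisson with the same mean (linear cost)."""
    lam = sum(probs)
    term = math.exp(-lam)  # Poisson mass at k = 0
    total = term
    for k in range(1, observed + 1):
        term *= lam / k
        total += term
    return total


# Hypothetical per-sample mutation probabilities for two genes (assumed data).
p_gene_a = [0.10, 0.30, 0.05, 0.20, 0.15, 0.25]
p_gene_b = [0.20, 0.10, 0.30, 0.05, 0.25, 0.15]

# Under independence, a sample carries both mutations with probability p_a * p_b.
co_probs = [a * b for a, b in zip(p_gene_a, p_gene_b)]

observed_co_mutations = 0  # hypothetical observed overlap
p_exact = exact_lower_tail(co_probs, observed_co_mutations)
p_approx = poisson_lower_tail(co_probs, observed_co_mutations)
print(f"exact lower tail:   {p_exact:.6f}")
print(f"Poisson lower tail: {p_approx:.6f}")
```

A small lower-tail probability means fewer co-mutations were observed than independence would predict, which is the signal of mutual exclusivity. The exact routine does quadratic work per gene pair while the approximation does linear work, which illustrates why an approximation pays off when millions of pairs must be tested.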

Funder

Editor project (Cancer Research UK, AECC and AIRC under the Accelerator Award Programme)

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics,Computational Theory and Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Statistics and Probability

References (9 articles)

1. An approximation theorem for the Poisson binomial distribution;Le Cam,1960

2. A novel independence test for somatic alterations in cancer shows that biology drives mutual exclusivity but chance explains most co-occurrence;Canisius;Genome Biol,2016

3. Fitting linear models and generalized linear models with large data sets in R;Enea;Stat. Methods Anal. Large Datasets Book Short Papers,2009

4. Maftools: efficient and comprehensive analysis of somatic variants in cancer;Mayakonda;Genome Res,2018

5. New functionalities in the TCGAbiolinks package for the study and integration of cancer data from GDC and GTEX;Mounir;PLoS Comput. Biol,2019
