BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data

Author:

Moraga Carol (1, 2, 3); Sanchez Evelyn (4, 5); Ferrarini Mariana Galvão (1, 2, 6); Gutierrez Rodrigo A (5, 7, 8); Vidal Elena A (4, 5, 9); Sagot Marie-France (1, 2)

Affiliation:

1. Université de Lyon, Université Lyon 1, CNRS, Laboratoire de Biométrie et Biologie Evolutive UMR 5558, F-69622 Villeurbanne, France

2. Inria Lyon Centre, ERABLE team, 56 Bd Niels Bohr, 69100 Villeurbanne, France

3. Universidad de O'Higgins, Instituto de Ciencias de la Ingeniería, 2820000 Rancagua, Chile

4. Centro de Genómica y Bioinformática, Facultad de Ciencias, Ingenieria y Tecnologia, Universidad Mayor, 8580745 Santiago, Chile

5. Agencia Nacional de Investigación y Desarrollo–Millennium Science Initiative Program, Millennium Institute for Integrative Biology iBio, 7500565 Santiago, Chile

6. Université de Lyon, INSA-Lyon, INRA, BF2i, UMR0203, Villeurbanne F-69621, France

7. Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, 8331010 Santiago, Chile

8. Fondo de Desarrollo de Areas Prioritarias, Center for Genome Regulation, Instituto de Ecología y Biodiversidad, 8370415 Santiago, Chile

9. Escuela de Biotecnología, Facultad de Ciencias, Ingenieria y Tecnologia, Universidad Mayor, 8580745 Santiago, Chile

Abstract

MicroRNAs (miRNAs) are small noncoding RNAs that are key players in the regulation of gene expression. In the past decade, with the increasing accessibility of high-throughput sequencing technologies, different methods have been developed to identify miRNAs, most of which rely on preexisting reference genomes. However, when a reference genome is absent or is not of high quality, such identification becomes more difficult. In this context, we developed BrumiR, an algorithm that discovers miRNAs directly and exclusively from small RNA sequencing (sRNA-seq) data. We benchmarked BrumiR on datasets encompassing animal and plant species, using real and simulated sRNA-seq experiments. The results demonstrate that BrumiR reaches the highest recall for miRNA discovery, while being much faster and more efficient than the state-of-the-art tools evaluated. This efficiency allows BrumiR to analyze a large number of sRNA-seq experiments from plant or animal species. Moreover, BrumiR detects additional information regarding other expressed sequences (sRNAs, isomiRs, etc.), thus maximizing the biological insight gained from sRNA-seq experiments. Additionally, when a reference genome is available, BrumiR provides a new mapping tool (BrumiR2reference) that performs an a posteriori exhaustive search to identify the precursor sequences. Finally, we also provide a machine learning classifier based on a random forest model that evaluates sequence-derived features to further refine the predictions obtained from the BrumiR-core. The code of BrumiR and all the algorithms that compose the BrumiR toolkit are freely available at https://github.com/camoragaq/BrumiR.

Funder

Consejo Nacional de Innovación, Ciencia y Tecnología

Fondo Nacional de Desarrollo Científico y Tecnológico

Publisher

Oxford University Press (OUP)

Subject

Computer Science Applications, Health Informatics

Cited by 2 articles.
