Unification of miRNA and isomiR research: the mirGFF3 format and the mirtop API

Author:

Desvignes Thomas (1), Loher Phillipe (2), Eilbeck Karen (3), Ma Jeffery (2), Urgese Gianvito (4), Fromm Bastian (5), Sydes Jason (1), Aparicio-Puerta Ernesto (6), Barrera Victor (7), Espín Roderic (8), Thibord Florian (9, 10), Bofill-De Ros Xavier (11), Londin Eric (2), Telonis Aristeidis G (2), Ficarra Elisa (4), Friedländer Marc R (5), Postlethwait John H (1), Rigoutsos Isidore (2), Hackenberg Michael (6), Vlachos Ioannis S (12), Halushka Marc K (13), Pantano Lorena (14)

Affiliation:

1. Institute of Neuroscience, University of Oregon, Eugene, OR 97403, USA

2. Computational Medicine Center, Thomas Jefferson University, Philadelphia, PA 19144, USA

3. University of Utah, Biomedical Informatics, Salt Lake City, UT 84108, USA

4. Department of Control and Computer Engineering, Politecnico di Torino, Torino 10129, Italy

5. Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm 114 18, Sweden

6. Computational Epigenomics Laboratory, Genetics Department and Biotechnology Institute and Biosanitary Institute, University of Granada, Granada 18002, Spain

7. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA

8. Universitat Oberta de Catalunya, Barcelona 08018, Spain

9. Sorbonne Université, Pierre Louis Doctoral School of Public Health, Paris 75006, France

10. Institut National pour la Santé et la Recherche Médicale (INSERM) Unité Mixte de Recherche en Santé (UMR_S), University of Bordeaux, Bordeaux 33076, France

11. RNA Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA

12. Non-coding Research Lab, Department of Pathology, Cancer Research Institute, Harvard Medical School Initiative for RNA Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02115, USA

13. Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA

14. Bioinformatics Core, The Picower Institute for Learning and Memory, Cambridge, MA 02139, USA

Abstract

Motivation: MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotides long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. While many miRNA-seq analysis tools exist, the diversity of their output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods.

Results: To address this, we present the miRNA Transcriptomic Open Project (miRTOP), a community-based project working towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream isomiR analysis tools that are compatible with existing detection and quantification tools. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command-line Python tool, mirtop, to create and manage files in the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, isomiR-SEA, sRNAbench and Prost!, as well as BAM files. Some tools, such as miRge2.0, IsoMIRmap and OptimiR, have also incorporated the mirGFF3 format directly into their code. The format's open architecture enables any tool or pipeline to output or convert results into mirGFF3. Collectively, this isomiR categorization system, along with the accompanying mirGFF3 format and mirtop API, provides a comprehensive solution for the standardization of miRNA and isomiR annotation, enabling data sharing, reporting, comparative analyses and benchmarking, while promoting the development of common miRNA methods focusing on the steps downstream of miRNA detection, annotation and quantification.

Availability and implementation: https://github.com/miRTop/mirGFF3/ and https://github.com/miRTop/mirtop.

Supplementary information: Supplementary data are available at Bioinformatics online.
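Because mirGFF3 builds on GFF3, a record can in principle be read with a generic nine-column, tab-separated parser. The Python sketch below illustrates that idea only. It is not the mirtop API, and the example record, its attribute names (Name, Parent, Variant, Expression) and the `key=value` attribute syntax are assumptions made for illustration; the authoritative attribute names and separators are defined in the mirGFF3 specification at the repository listed above.

```python
from typing import Dict, NamedTuple

GFF3_COLUMNS = 9  # generic GFF3 lines have nine tab-separated columns


class Gff3Record(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_attributes(field: str) -> Dict[str, str]:
    """Split a 'key=value;key=value' attribute column into a dict.

    Assumption: generic GFF3 'key=value' syntax; the real mirGFF3
    specification may use different separators.
    """
    attributes = {}
    for item in field.strip().split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        attributes[key.strip()] = value.strip()
    return attributes


def parse_gff3_line(line: str) -> Gff3Record:
    """Parse one tab-separated GFF3-style line into a Gff3Record."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != GFF3_COLUMNS:
        raise ValueError(f"expected {GFF3_COLUMNS} columns, got {len(columns)}")
    seqid, source, type_, start, end, score, strand, phase, attrs = columns
    return Gff3Record(seqid, source, type_, int(start), int(end),
                      score, strand, phase, parse_attributes(attrs))


# Hypothetical record, invented for this example (not taken from real data).
EXAMPLE_LINE = "\t".join([
    "hsa-let-7a-1", "example_db", "isomiR", "4", "25", ".", "+", ".",
    "Name=hsa-let-7a-5p;Parent=hsa-let-7a-1;Variant=iso_5p:+1;Expression=861",
])

record = parse_gff3_line(EXAMPLE_LINE)
print(record.type, record.start, record.end, record.attributes["Variant"])
```

A shared record structure of this kind is what would let downstream analysis code stay independent of the tool that produced the file.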

Funder

National Science Foundation

Strategic Research Area

Swedish Research Council

National Institutes of Health

National Heart Lung Blood Institute

GENMED laboratory of excellence on medical genomics

George and Marie Vergottis Fellowship of Harvard Medical School

NIH

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
