ExplorATE: a new pipeline to explore active transposable elements from RNA-seq data

Author:

Femenias Martin M (1), Santos Juan C (2), Sites Jack W (3), Avila Luciano J (1), Morando Mariana (1)

Affiliation:

1. Consejo Nacional de Investigaciones Científicas y Técnicas, Instituto Patagónico para el Estudio de los Ecosistemas Continentales (IPEEC-CONICET), Puerto Madryn, CT U9120ACD, Argentina

2. Department of Biological Sciences, St. John's University, Queens, NY 11439, USA

3. Department of Biology and M.L. Bean Life Science Museum, Brigham Young University (BYU), Provo, UT 84602, USA

Abstract

Motivation: Transposable elements (TEs) are ubiquitous in genomes and many remain active. TEs comprise an important fraction of transcriptomes, with potential effects on the host genome, either by generating deleterious mutations or by promoting evolutionary novelties. However, their functional study is limited by the difficulty of identifying and quantifying them, particularly in non-model organisms.

Results: We developed a new pipeline, ExplorATE (explore active transposable elements), implemented in R and bash, that allows the quantification of active TEs in both model and non-model organisms. ExplorATE creates TE-specific indexes and uses Selective Alignment (SA) to filter out co-transcribed transposons within genes based on alignment scores. Moreover, our software incorporates Wicker-like criteria to refine a set of target TEs and avoid spurious mapping. Based on simulated and real data, we show that the SA strategy adopted by ExplorATE achieved better estimates of non-co-transcribed elements than other available alignment-based or mapping-based software. ExplorATE results showed high congruence with alignment-based tools with and without a reference genome, yet ExplorATE required less execution time. Likewise, ExplorATE expands and complements most previous TE analyses by incorporating the co-transcription and multi-mapping effects during quantification, and provides seamless integration with other downstream tools within the R environment.

Availability and implementation: Source code is available at https://github.com/FemeniasM/ExplorATEproject and https://github.com/FemeniasM/ExplorATE_shell_script. Data are available on request.

Supplementary information: Supplementary data are available at Bioinformatics online.
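The Wicker-like refinement of target TEs mentioned in the Results can be illustrated with a minimal sketch. The abstract does not state the thresholds or the data model, so everything below is an assumption: the record type `RepeatHit`, the function names, and the default cut-offs (borrowed from the commonly cited 80-80-80 rule) are hypothetical, and the length ratio is computed here against the transcript length purely for illustration. It is not ExplorATE's implementation, which is written in R and bash.

```python
from dataclasses import dataclass


@dataclass
class RepeatHit:
    """One TE annotation found in a transcript (hypothetical record type)."""
    transcript_id: str
    te_family: str
    percent_identity: float   # identity of the hit to the TE consensus, 0-100
    hit_length: int           # length of the annotated TE region, in bp
    transcript_length: int    # total transcript length, in bp


def passes_wicker_like(hit, min_identity=80.0, min_length_ratio=80.0, min_length=80):
    """Return True if the hit meets all three thresholds (assumed defaults)."""
    length_ratio = 100.0 * hit.hit_length / hit.transcript_length
    return (
        hit.percent_identity >= min_identity
        and length_ratio >= min_length_ratio
        and hit.hit_length >= min_length
    )


def select_target_tes(hits, **thresholds):
    """Keep only the hits that satisfy the Wicker-like criteria."""
    return [h for h in hits if passes_wicker_like(h, **thresholds)]


# Made-up example records, not data from the paper.
example_hits = [
    RepeatHit("tx1", "LINE/L2", 92.0, 900, 1000),   # mostly TE, high identity
    RepeatHit("tx2", "SINE/MIR", 95.0, 120, 2400),  # short TE fragment inside a gene
    RepeatHit("tx3", "DNA/hAT", 70.0, 450, 500),    # mostly TE, low identity
]

if __name__ == "__main__":
    for hit in select_target_tes(example_hits):
        print(hit.transcript_id, hit.te_family)
```

Under these assumed thresholds, a transcript that is mostly TE sequence with high identity is kept as a target, whereas a short TE fragment embedded in a long gene transcript is excluded, which is the kind of spurious target the refinement step is meant to avoid.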

Funder

Fondo para la Investigación Científica y Tecnológica

National Science Foundation

St. John's University

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

