Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence

Published: 2021-03-12 Issue: 1 Volume: 12 Page:
ISSN: 2041-1723
Container-title: Nature Communications
Language: en
Short-container-title: Nat Commun

Author:

Lusk Ryan, Stene Evan, Banaei-Kashani Farnoush, Tabakoff Boris, Kechris Katerina, Saba Laura M.

Abstract

Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3′-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model, trained using the Human Brain Reference RNA commercial standard, performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi’s input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
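
As a minimal illustration of the precision and recall terms used in the abstract, the Python sketch below scores a set of predicted polyadenylation sites against a set of known sites. The coordinates, the `tolerance` window, and the function name `score_sites` are hypothetical and chosen only for illustration; this is not aptardi's own evaluation procedure.

```python
def score_sites(predicted, truth, tolerance=0):
    """Return (precision, recall) for predicted vs. true site positions.

    A predicted site counts as correct when it lies within `tolerance`
    bases of some true site; a true site counts as detected when some
    predicted site lies within `tolerance` bases of it.
    """
    predicted = set(predicted)
    truth = set(truth)
    correct_predictions = sum(
        any(abs(p - t) <= tolerance for t in truth) for p in predicted
    )
    detected_truth = sum(
        any(abs(p - t) <= tolerance for p in predicted) for t in truth
    )
    # Guard against empty inputs rather than dividing by zero.
    precision = correct_predictions / len(predicted) if predicted else 0.0
    recall = detected_truth / len(truth) if truth else 0.0
    return precision, recall


# Toy coordinates, invented for illustration only (assumption, not real data).
true_sites = [1000, 2500, 4000, 7200]
predicted_sites = [1005, 2500, 9000]

precision, recall = score_sites(predicted_sites, true_sites, tolerance=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these toy inputs, two of the three predictions fall near a true site and two of the four true sites are detected, so recall here is the fraction of true sites recovered, matching the definition given in the abstract.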

Funder

U.S. Department of Health & Human Services | NIH | National Institute on Alcohol Abuse and Alcoholism

U.S. Department of Health & Human Services | NIH | National Institute on Drug Abuse

Publisher

Springer Science and Business Media LLC

Subject

General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry

Link

http://www.nature.com/articles/s41467-021-21894-x.pdf

References: 98 articles.

1. Di Giammartino, D. C., Nishida, K. & Manley, J. L. Mechanisms and consequences of alternative polyadenylation. Mol. Cell 43, 853–866 (2011).

2. Tian, B. & Manley, J. L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 18, 18–30 (2017).

3. Park, J. Y. et al. Comparative analysis of mRNA isoform expression in cardiac hypertrophy and development reveals multiple post-transcriptional regulatory modules. PLoS ONE 6, e22391 (2011).

4. de Klerk, E. et al. Poly(A) binding protein nuclear 1 levels affect alternative polyadenylation. Nucleic Acids Res. 40, 9089–9101 (2012).

5. Jenal, M. et al. The poly(A)-binding protein nuclear 1 suppresses alternative cleavage and polyadenylation sites. Cell 149, 538–553 (2012).

Cited by 23 articles.

1. Characteristics of inter-root soil bacterial community structure and diversity of different sand-fixing shrubs at the southeastern edge of the Mu Us Desert, China;Annals of Microbiology;2024-08-08

2. Big data and deep learning for RNA biology;Experimental & Molecular Medicine;2024-06-14

3. Gene regulation via RNA isoform variations;Beyond the Blueprint - Decoding the Elegance of Gene Expression [Working Title];2024-05-24

4. InPACT: a computational method for accurate characterization of intronic polyadenylation from RNA sequencing data;Nature Communications;2024-03-22

5. TDP-43 loss induces extensive cryptic polyadenylation in ALS/FTD;2024-01-23