Abstract
Background
Over the last decade, the drop in short-read sequencing costs has allowed sequencing-based experimental techniques that address specific biological questions to proliferate, often outpacing standardized or effective approaches for analyzing the resulting data. There are growing amounts of bacterial 3’-end sequencing data, yet there is currently no commonly accepted analysis methodology for this data type. Most data analysis approaches are somewhat ad hoc and, despite the presence of substantial signal within annotated genes, focus on genomic regions outside the annotated genes (e.g. 3’ or 5’ UTRs). Furthermore, the lack of a systematic approach to analyzing such data makes it impossible to compare conclusions generated by different labs using different organisms.

Results
We present PIPETS (Poisson Identification of PEaks from Term-Seq data), an R package available on Bioconductor that provides a novel analysis method for 3’-end sequencing data. PIPETS is a statistically informed, gene-annotation-agnostic methodology. Across two different datasets, PIPETS identified significant 3’-end termination signal across a wider range of annotated genomic contexts than existing analysis approaches, suggesting that existing approaches may miss biologically relevant signal. Furthermore, assessment of the previously called 3’-end positions not captured by PIPETS showed that they uniformly had very low coverage.

Conclusions
PIPETS provides a broadly applicable platform for comparing 3’-end sequencing data sets across different organisms. It requires only the 3’-end sequencing data and is broadly accessible to non-expert users.
Publisher
Cold Spring Harbor Laboratory