The sum of two halves may be different from the whole. Effects of splitting sequencing samples across lanes

Published: 2021-05-11

Author:

Williams Eleanor C., Chazarra-Gil Ruben, Shahsavari Arash, Mohorianu Irina

Abstract

The advances in high throughput sequencing (HTS) enabled the characterisation of biological processes at an unprecedented level of detail; the majority of hypotheses in molecular biology rely on analyses of HTS data. However, achieving increased robustness and reproducibility of results remains one of the main challenges. Although variability in results may be introduced at various stages, e.g. alignment, summarisation or detection of differences in expression, one source of variability was systematically omitted: the sequencing design which propagates through analyses and may introduce an additional layer of technical variation.

We illustrate qualitative and quantitative differences arising from splitting samples across lanes, on bulk and single-cell sequencing. For bulk mRNAseq data, we focus on differential expression and enrichment analyses; for bulk ChIPseq data, we investigate the effect on peak calling, and peaks’ properties. At single-cell level, we concentrate on identifying cell subpopulations. We rely on markers used for assigning cell identities; both smartSeq and 10x data are presented.

The observed reduction in the number of unique sequenced fragments reduces the level of detail on which the different prediction approaches depend. Further, the sequencing stochasticity adds in a weighting bias corroborated with variable sequencing depths and (yet unexplained) sequencing bias.

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. ClustAssess: tools for assessing the robustness of single-cell clustering; 2022-02-02