Fast analysis of scATAC-seq data using a predefined set of genomic regions

Published:2020-05-28 Issue: Volume:9 Page:199
ISSN:2046-1402
Container-title:F1000Research
language:en
Short-container-title:F1000Res

Author:

Giansanti Valentina, Tang Ming, Cittaro Davide

Abstract

Background: Analysis of scATAC-seq data has recently been scaled to thousands of cells. While processing of other types of single-cell data has been accelerated by alignment-free techniques, the pipelines available for scATAC-seq data still require large computational resources. We propose an approach based on pseudoalignment that reduces execution time and hardware requirements at little cost in precision. Methods: Public data for 10k PBMCs were downloaded from the 10x Genomics website. Reads were aligned to various references derived from DNase I hypersensitive sites (DHS) using kallisto and quantified with bustools. We compared our results with the publicly available ones produced by cellranger-atac. We subsequently tested our approach on scATAC-seq data for the K562 cell line. Results: We found that kallisto does not introduce biases in the quantification of known peaks; the cell groups identified are consistent with those identified by the standard method. We also found that cell identification is robust when the analysis uses a DHS-derived reference in place of de novo identification of ATAC peaks. Lastly, we found that our approach is suitable for reliable quantification of gene activity based on scATAC-seq signal, and thus allows efficient labelling of cell groups based on marker genes. Conclusions: Analysis of scATAC-seq data with kallisto produces results in line with standard pipelines while being considerably faster; using a set of known DHS as the reference does not affect the ability to characterize the cell populations.
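To make concrete what "quantification over a predefined set of genomic regions" produces, the following is a minimal illustrative sketch in Python. It is not the authors' pipeline and does not use kallisto or bustools: it swaps pseudoalignment for a plain coordinate lookup, and the region table, barcodes, and function names (`REGIONS`, `region_for`, `count_matrix`) are all hypothetical. It only shows how fixed intervals, such as DHS, can serve as the features of a sparse cell-by-region count matrix without calling peaks de novo.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical predefined reference: chromosome -> sorted, non-overlapping
# half-open intervals (start, end). In the paper these would be DHS-derived.
REGIONS = {
    "chr1": [(100, 200), (500, 650)],
    "chr2": [(50, 120)],
}


def region_for(chrom, pos):
    """Return the (chrom, start, end) region containing pos, or None."""
    intervals = REGIONS.get(chrom, [])
    starts = [start for start, _ in intervals]
    # Index of the last interval whose start is <= pos.
    i = bisect_right(starts, pos) - 1
    if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
        start, end = intervals[i]
        return (chrom, start, end)
    return None


def count_matrix(fragments):
    """Count fragments per (barcode, region); a sparse cell-by-region matrix.

    fragments: iterable of (barcode, chrom, pos) tuples.
    """
    counts = Counter()
    for barcode, chrom, pos in fragments:
        region = region_for(chrom, pos)
        if region is not None:
            counts[(barcode, region)] += 1
    return counts


# Toy input: positions outside every region are simply ignored.
fragments = [
    ("AAAC", "chr1", 150),
    ("AAAC", "chr1", 199),
    ("AAAC", "chr1", 200),  # interval end is exclusive, so not counted
    ("TTTG", "chr1", 510),
    ("TTTG", "chr2", 60),
    ("TTTG", "chr3", 10),   # chromosome absent from the reference
]
counts = count_matrix(fragments)
```

Because the feature set is fixed in advance, every cell is counted against the same columns, which is what allows results from different runs or datasets to be compared directly.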

Funder

Associazione Italiana per la Ricerca sul Cancro

National Institutes of Health

Cancer Research UK

Publisher

F1000 Research Ltd

Subject

General Pharmacology, Toxicology and Pharmaceutics; General Immunology and Microbiology; General Biochemistry, Genetics and Molecular Biology; General Medicine

Link

https://f1000research.com/articles/9-199/v2/pdf

Reference: 36 articles.

1. Exponential scaling of single-cell RNA-seq in the past decade.;V Svensson;Nat Protoc.,2018

2. SCANPY: large-scale single-cell gene expression data analysis.;F Wolf;Genome Biol.,2018

3. STAR: ultrafast universal RNA-seq aligner.;A Dobin;Bioinformatics.,2013

4. Alignment-free sequence comparison: benefits, applications, and tools.;A Zielezinski;Genome Biol.,2017

5. RNA sequencing data: hitchhiker’s guide to expression analysis.;K Van den Berge;Annu Rev Biomed Data Sci.,2019

Cited by 13 articles.

1. SCInter: A comprehensive single-cell transcriptome integration database for human and mouse;Computational and Structural Biotechnology Journal;2024-12

2. Disruption of maternal vascular remodeling by a fetal endoretrovirus-derived gene in preeclampsia;Genome Biology;2024-05-07

3. Integrative Single Cell Atlas Revealed Intratumoral Heterogeneity Generation from an Adaptive Epigenetic Cell State in Human Bladder Urothelial Carcinoma;Advanced Science;2024-04-06

4. Tensor decomposition discriminates tissues using scATAC-seq;Biochimica et Biophysica Acta (BBA) - General Subjects;2023-06

5. Fundamental and practical approaches for single-cell ATAC-seq analysis;aBIOTECH;2022-09-27