Vulture: cloud-enabled scalable mining of microbial reads in public scRNA-seq data

Published:2024 Volume:13
ISSN:2047-217X
Container-title:GigaScience
language:en

Author:

Chen Junyi¹², Yin Danqing¹², Wong Harris Y H¹, Duan Xin¹, Yu Ken H O¹², Ho Joshua W K¹²

Affiliation:

1. Laboratory of Data Discovery for Health Limited (D24H), Hong Kong Science Park, Hong Kong SAR, China

2. School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong SAR, China

Abstract

The rapidly growing collection of public single-cell sequencing data has become a valuable resource for molecular, cellular, and microbial discovery. Previous studies mostly overlooked detecting pathogens in human single-cell sequencing data. Moreover, existing bioinformatics tools lack the scalability to deal with big public data. We introduce Vulture, a scalable cloud-based pipeline that performs microbial calling for single-cell RNA sequencing (scRNA-seq) data, enabling meta-analysis of host–microbial studies from the public domain. In our benchmarking experiments, Vulture is 66% to 88% faster than local tools (PathogenTrack and Venus) and 41% faster than the state-of-the-art cloud-based tool Cumulus, while achieving comparable microbial read identification. In terms of the cost on cloud computing systems, Vulture also shows a cost reduction of 83% ($12 vs. $70). We applied Vulture to 2 coronavirus disease 2019, 3 hepatocellular carcinoma (HCC), and 2 gastric cancer human patient cohorts with public sequencing reads data from scRNA-seq experiments and discovered cell type–specific enrichment of severe acute respiratory syndrome coronavirus 2, hepatitis B virus (HBV), and Helicobacter pylori–positive cells, respectively. In the HCC analysis, all cohorts showed hepatocyte-only enrichment of HBV, with cell subtype-associated HBV enrichment based on inferred copy number variations. In summary, Vulture presents a scalable and economical framework to mine unknown host–microbial interactions from large-scale public scRNA-seq data. Vulture is available via an open-source license at https://github.com/holab-hku/Vulture.

Funder

Innovation and Technology Commission - Hong Kong

Publisher

Oxford University Press (OUP)

Link

https://academic.oup.com/gigascience/article-pdf/doi/10.1093/gigascience/giad117/55271471/giad117.pdf
