AGImpute: imputation of scRNA-seq data based on a hybrid GAN with dropouts identification

Author:

Zhu Xiaoshu1, Meng Shuang2, Li Gaoshi2, Wang Jianxin3, Peng Xiaoqing4

Affiliation:

1. School of Computer and Information Security, Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China

2. School of Computer Science and Engineering, Guangxi Normal University, Guilin 541006, China

3. School of Computer Science and Engineering, Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 400083, China

4. School of Life Sciences, Center for Medical Genetics, Central South University, Changsha 400083, China

Abstract

Motivation

Dropout events complicate the analysis of single-cell RNA sequencing (scRNA-seq) data because they introduce noise and distort the true distributions of gene expression profiles. Recent studies focus on estimating dropout probability and imputing dropout events by leveraging information from similar cells or genes. However, the number of dropout events differs between cells because of complex factors such as sequencing protocols, cell types, and batch effects. These differences are not fully considered when assessing the similarities between cells and genes, which compromises the reliability of downstream analysis.

Results

This work proposes AGImpute, a hybrid Generative Adversarial Network with dropouts identification for imputing scRNA-seq data. First, the number of dropout events in each cell is estimated separately using a dynamic threshold estimation strategy. Next, the identified dropout events are imputed by a hybrid deep learning model that combines an Autoencoder with a Generative Adversarial Network. To validate its effectiveness, AGImpute is compared with seven state-of-the-art dropout imputation methods on two simulated datasets and seven real scRNA-seq datasets. The results show that AGImpute imputes the fewest dropout events of all the compared methods. Moreover, AGImpute improves downstream analysis, including clustering, identification of cell-specific marker genes, and trajectory inference in the time-course dataset.

Availability and implementation

The source code can be obtained from https://github.com/xszhu-lab/AGImpute.
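The abstract does not spell out the dynamic threshold estimation strategy, so the following is only a minimal sketch of the general idea of identifying dropouts with a threshold that varies per cell. It is not AGImpute's actual algorithm. The function name `flag_candidate_dropouts`, the parameter `base_rate`, and the threshold formula are all hypothetical and chosen purely for illustration: a zero count is flagged as a candidate dropout when the gene is detected in a large enough fraction of cells, and that cutoff is relaxed for cells that contain more zeros.

```python
def flag_candidate_dropouts(counts, base_rate=0.5):
    """Flag zeros that look like dropouts, using a per-cell threshold.

    counts: list of cells, each a list of gene counts (cells x genes).
    base_rate: hypothetical tuning parameter, not taken from the paper.
    Returns a list of boolean lists with the same shape as counts.
    """
    n_cells = len(counts)
    n_genes = len(counts[0])

    # Fraction of cells in which each gene is detected (count > 0).
    detection = [
        sum(1 for cell in counts if cell[g] > 0) / n_cells
        for g in range(n_genes)
    ]

    flags = []
    for cell in counts:
        # Share of zero entries in this cell.
        zero_fraction = sum(1 for v in cell if v == 0) / n_genes
        # Assumed rule: cells with more zeros get a lower cutoff,
        # so more of their zeros are treated as candidate dropouts.
        threshold = base_rate * (1.0 - zero_fraction)
        flags.append(
            [v == 0 and detection[g] >= threshold for g, v in enumerate(cell)]
        )
    return flags


# Toy matrix: 3 cells x 4 genes; the last gene is never expressed.
toy_counts = [
    [5, 0, 3, 0],
    [4, 2, 0, 0],
    [6, 1, 2, 0],
]
toy_flags = flag_candidate_dropouts(toy_counts)
```

In this toy matrix the zeros of the second and third genes are flagged because those genes are detected in other cells, while the zeros of the last gene are left alone as plausible true zeros. In the pipeline the abstract describes, only the flagged entries would then be passed to the imputation model.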

Funder

National Natural Science Foundation of China

Publisher

Oxford University Press (OUP)

