Author:
Zhang, Luo, Zhong, Choi, Ma, Wang, Mahrt, Guo, Stawiski, Modrusan, Seshagiri, Kapur, Hon, Brugarolas, Wang
Abstract
Advances in single-cell RNA sequencing (scRNA-Seq) have allowed for comprehensive analyses of single-cell data. However, current analyses of scRNA-Seq data usually start from unsupervised clustering or visualization. These methods ignore the prior knowledge of transcriptomes and of the probable structures of the data. Moreover, cell identification relies heavily on subjective and inaccurate human inspection afterwards. To address these analytical challenges, we developed the Semi-supervised Category Identification and Assignment (SCINA) algorithm, a semi-supervised model for analyses of scRNA-Seq and flow cytometry/CyTOF data, and other data of similar format, that automatically exploits previously established gene signatures using an expectation–maximization (EM) algorithm. We applied SCINA to a wide range of datasets and showed that its accuracy, stability and efficiency exceeded those of most popular unsupervised approaches. SCINA discovered an intermediate stage of oligodendrocytes from mouse brain scRNA-Seq data. SCINA also detected immune cell population shifting in Stk4 knock-out mouse cytometry data. Finally, SCINA identified a new kidney tumor clade with similarity to FH-deficient tumors from bulk tumor data. Overall, SCINA provides both methodological advances and biological insights from perspectives different from traditional analytical methods.
Funder
National Institutes of Health
University of Texas Southwestern Medical Center
Cancer Prevention and Research Institute of Texas
Subject
Genetics (clinical), Genetics
Cited by
177 articles.