A cell marker‐based clustering strategy (cmCluster) for precise cell type identification of scRNA‐seq data

Published:2023-06 Issue:2 Volume:11 Page:163-174
ISSN:2095-4689
Container-title:Quantitative Biology
language:en
Short-container-title:Quant. Biol.

Author:

Huang Yuwei¹, Chang Huidan¹, Chen Xiaoyi², Meng Jiayue¹, Han Mengyao¹, Huang Tao¹, Yuan Liyun¹, Zhang Guoqing¹

Affiliation:

1. CAS Key Laboratory of Computational Biology, Bio‐Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China

2. Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo 315000, China

Abstract

Background: Precise and efficient analysis of single‐cell transcriptome data is essential for studying the diversity of cell functions at single‐cell resolution. The most important and challenging steps are cell clustering and the recognition of cell populations. Most current studies assess the precision of clustering and of annotation separately, so it is worth developing an extensive and flexible strategy that balances clustering accuracy against biological interpretability.

Methods: The cell marker‐based clustering strategy (cmCluster) is a modified Louvain clustering method that searches for the optimal clusters using a genetic algorithm (GA) and grid search, guided by the cell type annotation results.

Results: Applied to a set of single‐cell transcriptome data, cmCluster aided the recognition of cell populations and the interpretation of biological function, even when cell type information was incomplete or the data came from multiple sources. cmCluster also produced clear cluster boundaries and appropriate subtypes with potential marker genes. The code is available on GitHub (huangyuwei301/cmCluster).

Conclusions: We speculate that cmCluster provides researchers with effective screening strategies to improve the accuracy of subsequent biological analysis, reduce artificial bias, and facilitate the comparison and analysis of multiple studies.

Publisher

Wiley

Subject

Applied Mathematics; Computer Science Applications; Biochemistry, Genetics and Molecular Biology (miscellaneous); Modeling and Simulation

Link

https://onlinelibrary.wiley.com/doi/pdf/10.15302/J-QB-022-0311

References: 43 articles.

1. Single-cell transcriptional diversity is a hallmark of developmental potential

2. mRNA-Seq whole-transcriptome analysis of a single cell

3. Current best practices in single‐cell RNA‐seq analysis: a tutorial

4. Power analysis of single-cell RNA-sequencing experiments

Cited by 1 article.

1. Combining Global-Constrained Concept Factorization and a Regularized Gaussian Graphical Model for Clustering Single-Cell RNA-seq Data;Interdisciplinary Sciences: Computational Life Sciences;2023-10-10