Abstract
Single-cell RNA sequencing (scRNA-seq) has provided unprecedented opportunities for exploring gene expression and thus uncovering regulatory relationships between genes at the single-cell level. However, scRNA-seq relies on isolating cells from tissues, so the spatial context of the regulatory processes is lost. A recent technological innovation, spatial transcriptomics, allows gene expression to be measured while preserving spatial information. A first step in spatial transcriptomic analysis is to identify cell types, which requires a careful selection of cell-type-specific marker genes. For this purpose, scRNA-seq data are currently used to select, from among all genes, a limited number of marker genes that distinguish cell types from each other. This study proposes scMAGS (single-cell MArker Gene Selection), a new approach for marker gene selection from scRNA-seq data for spatial transcriptomics studies. scMAGS first applies a filtering step that extracts candidate genes, and then selects marker genes from these candidates. For the selection of marker genes, a cluster validity index is used: the Silhouette index or, for large datasets, the Calinski-Harabasz index. Experimental results showed that, in comparison to existing methods, scMAGS is scalable, fast, and accurate. Even for large datasets with millions of cells, scMAGS could find the required number of marker genes in a reasonable amount of time and with lower memory requirements.
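To make the role of a cluster validity index concrete, the following is a minimal sketch, not the scMAGS implementation. It assumes that each gene is scored on its own by the Calinski-Harabasz index (between-cell-type dispersion divided by within-cell-type dispersion, each scaled by its degrees of freedom) and that the highest-scoring genes are kept. The function names `calinski_harabasz_1d` and `rank_marker_candidates`, the per-gene scoring scheme, and the toy expression values are all illustrative assumptions; the filtering step is omitted.

```python
from collections import defaultdict


def calinski_harabasz_1d(values, labels):
    """Calinski-Harabasz index of a single gene across labelled cells."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 cell types and more cells than cell types")
    grand_mean = sum(values) / n
    # Dispersion of cell-type means around the overall mean.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    # Dispersion of cells around their own cell-type mean.
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def rank_marker_candidates(expression, labels, n_markers):
    """Return the n_markers gene names with the highest per-gene index (hypothetical helper)."""
    scores = {gene: calinski_harabasz_1d(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_markers]


# Toy data: 6 cells, 2 cell types, 3 genes (values are made up for illustration).
labels = ["T", "T", "T", "B", "B", "B"]
expression = {
    "geneA": [9.0, 10.0, 11.0, 1.0, 0.0, 2.0],  # high in T cells, low in B cells
    "geneB": [5.0, 4.0, 6.0, 5.0, 6.0, 4.0],    # same mean in both cell types
    "geneC": [0.0, 1.0, 0.0, 6.0, 7.0, 8.0],    # low in T cells, high in B cells
}
top = rank_marker_candidates(expression, labels, n_markers=2)
```

In this toy example the two genes whose expression differs between the cell types outrank the gene whose mean is identical in both. A single pass over the cells per gene keeps the cost linear in the number of cells, which is consistent with the abstract reserving the Calinski-Harabasz index for large datasets, whereas the Silhouette index requires pairwise distances between cells.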
Publisher
Cold Spring Harbor Laboratory