Abstract
Recent advances in measurement technologies, particularly single-cell RNA sequencing (scRNA-seq), have revolutionized our ability to acquire large amounts of omics-level data on cellular states. As measurement techniques evolve, the need for data analysis methodologies has grown, especially for cell-type identification and inference of gene regulatory networks (GRNs). We have developed a new method named BootCellNet, which employs smoothing and resampling to infer GRNs. Using the inferred GRNs, BootCellNet further infers the minimum dominating set (MDS), a set of genes that determines the dynamics of the entire network. Using scRNA-seq datasets of peripheral blood mononuclear cells and hematopoiesis, we have demonstrated that BootCellNet robustly infers GRNs and their MDSs and facilitates unsupervised identification of cell clusters. It has also identified COVID-19 patient-specific cells and their potential regulatory transcription factors. BootCellNet not only identifies cell types in an unsupervised and explainable way but also provides insights into the characteristics of the identified cell types through the inference of GRNs and MDSs.
Author Summary
Single-cell omics technologies, such as single-cell RNA-seq (scRNA-seq), are instrumental in identifying novel cell subsets that are involved in various biological processes and diseases. These technologies, however, require further development in data analysis, especially for cell-type identification and inference of interactions between genes. The problem of cell-type identification is essentially one of clustering, which requires a balance between distinguishing different cell states and grouping similar ones together. Current clustering methods still suffer from uncertainty in determining the appropriate number of clusters and in explaining why some cells are clustered together while others are separated. The inference of interactions between genes, that is, of a gene regulatory network (GRN), remains challenging due to the noisy nature of scRNA-seq data. We have developed BootCellNet, a method that infers GRNs together with a set of genes dominating the network dynamics, and uses that set to cluster cells and identify cell types. The method addresses challenges in GRN inference and clustering simultaneously and will facilitate the generation of working hypotheses from large amounts of scRNA-seq data.
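To make the notion of a dominating set concrete, the following is a minimal Python sketch of a greedy approximation on a toy network. It is not BootCellNet's implementation: the function name `greedy_dominating_set`, the gene labels, and the treatment of the GRN as an undirected graph are all assumptions made for illustration, and a greedy heuristic is not guaranteed to return a true minimum dominating set.

```python
from typing import Dict, Set


def greedy_dominating_set(adjacency: Dict[str, Set[str]]) -> Set[str]:
    """Greedy approximation of a dominating set of an undirected graph.

    A dominating set is a set of nodes such that every node in the graph
    is either in the set or adjacent to a node in the set.
    Assumption: `adjacency` is symmetric (undirected network).
    """
    undominated = set(adjacency)  # nodes not yet covered
    chosen: Set[str] = set()
    while undominated:
        # Pick the node that covers the most still-undominated nodes
        # (itself plus its neighbours); sorted() makes ties deterministic.
        best = max(
            sorted(adjacency),
            key=lambda n: len(({n} | adjacency[n]) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | adjacency[best]
    return chosen


# Hypothetical toy network: TF_A regulates G1..G3, and G3 is linked to G4.
toy_network = {
    "TF_A": {"G1", "G2", "G3"},
    "G1": {"TF_A"},
    "G2": {"TF_A"},
    "G3": {"TF_A", "G4"},
    "G4": {"G3"},
}

dominators = greedy_dominating_set(toy_network)
print(sorted(dominators))
```

In this toy example two nodes suffice to dominate all five, which illustrates how a small set of genes can "cover" a network; finding a provably minimum set generally requires an exact solver rather than this heuristic.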
Publisher
Cold Spring Harbor Laboratory