Abstract
Single-cell RNA sequencing (scRNA-seq) data has been widely used for cell trajectory inference, under the assumption that cells with similar expression profiles share the same differentiation state. However, the inferred trajectory may not reflect the true clonal relationships among cells. Single-cell T cell receptor sequencing (scTCR-seq) data provides invaluable insights into the clonal relationships among cells, yet it lacks functional characteristics. scRNA-seq and scTCR-seq data therefore complement each other in improving trajectory inference, but a reliable computational tool for combining them is still missing. We developed LRT, a computational framework for the integrative analysis of scTCR-seq and scRNA-seq data for T cell trajectory inference. Specifically, LRT uses the TCR sequence information to identify clonally related cells and then uses the transcriptomics information from scRNA-seq data to construct clonotype-level cell trajectories. LRT provides a comprehensive analysis workflow, including preprocessing, cell trajectory clustering, pseudotime inference, and marker gene identification. We illustrated its utility using scRNA-seq and scTCR-seq data of CD4+ T cells with acute lymphocytic choriomeningitis virus infection, where we could identify cell trajectories that cannot be revealed based on scRNA-seq data alone. Our downstream analyses showed that (i) these trajectories are involved in distinct functional roles; (ii) the expression patterns of their marker genes over the estimated pseudotime coincide with the Th1/Tfh biology that is well established for CD4+ T cell differentiation; and (iii) a higher level of TCR sequence similarity was observed within clusters than between clusters. The LRT framework was implemented as an R package ‘LRT’, and it is now publicly accessible at https://github.com/JuanXie19/LRT.
In addition, it provides two Shiny apps, ‘shinyClone’ and ‘shinyClust’, that allow users to interactively explore distributions of clonotypes, conduct repertoire analysis, implement clustering of cell trajectories, and predict cell trajectory cluster marker genes.
Author Summary
Understanding the dynamic changes behind biological processes is important for determining the molecular mechanisms underlying normal tissue formation, developmental disorders and pathologies. Usually, a biological process can be characterized by identifying a trajectory, a path that goes through the various cellular states associated with the process. Since cells in different states may express different sets of genes, researchers often infer cell trajectories by capturing transcriptomic changes. Dozens of methods have been developed for cell trajectory inference, and they predominantly rely on scRNA-seq data. However, methods based only on scRNA-seq data cannot tell us whether cells on the same trajectory come from the same clone. T cells play a key role in the immune system, and their high antigen recognition specificity is largely determined by their TCR sequences. Thanks to the advent of scTCR-seq technology, researchers can identify groups of cells that come from the same clone. This paper describes our novel computational framework, LRT, and demonstrates that by complementing scRNA-seq data with the clonal information from scTCR-seq data, LRT can identify cell trajectories that cannot be revealed based on scRNA-seq data alone.
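The clonotype-level idea described above can be illustrated with a small toy sketch. It is not the LRT implementation (LRT is an R package, and its actual algorithms are not reproduced here). The sketch assumes that each cell carries a clonotype label derived from its TCR sequence and a position in a low-dimensional expression embedding. The CDR3 strings, the coordinates, the root position, and the function names `group_by_clonotype` and `order_from_root` are all hypothetical, and ordering cells by distance from an assumed root is only a crude stand-in for pseudotime inference.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy input: (cell id, clonotype label, 2-D embedding coordinates).
# The clonotype labels are made-up CDR3-like strings, not real data.
cells = [
    ("c1", "CASSLGTDTQYF", (0.0, 0.0)),
    ("c2", "CASSLGTDTQYF", (1.0, 0.5)),
    ("c3", "CASSLGTDTQYF", (2.0, 1.5)),
    ("c4", "CASSPGQGNEQFF", (0.0, 1.0)),
    ("c5", "CASSPGQGNEQFF", (-1.0, 2.0)),
    ("c6", "CASRDRGYEQYF", (3.0, 3.0)),
]


def group_by_clonotype(cells, min_cells=2):
    """Group cells that share a clonotype label; drop clonotypes with too few cells."""
    groups = defaultdict(list)
    for cell_id, clonotype, coords in cells:
        groups[clonotype].append((cell_id, coords))
    return {clone: members for clone, members in groups.items()
            if len(members) >= min_cells}


def order_from_root(members, root):
    """Order the cells of one clonotype by Euclidean distance from an assumed root.

    This is a simplification used only for illustration, not a pseudotime method.
    """
    ranked = sorted(members, key=lambda member: dist(member[1], root))
    return [cell_id for cell_id, _ in ranked]


root = (0.0, 0.0)  # assumed position of the starting (e.g. naive) state
clones = group_by_clonotype(cells)
trajectories = {clone: order_from_root(members, root)
                for clone, members in clones.items()}

for clone, path in trajectories.items():
    print(clone, "->", " > ".join(path))
```

In this sketch, the clonal grouping comes only from the TCR-derived label, while the ordering within each clonotype comes only from the expression embedding, which mirrors the division of roles between scTCR-seq and scRNA-seq data described above. The clonotype observed in a single cell is dropped because one cell cannot define a path.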
Publisher
Cold Spring Harbor Laboratory