psupertime: supervised pseudotime analysis for time-series single-cell RNA-seq data

Author:

Macnair Will (1), Gupta Revant (2), Claassen Manfred (2, 3)

Affiliation:

1. Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich 8093, Switzerland

2. Internal Medicine I, Faculty of Medicine, University of Tübingen, University Hospital Tübingen, 72074, Germany

3. Department of Computer Science, University of Tübingen, Tübingen 72074, Germany

Abstract

Motivation: Improvements in single-cell RNA-seq technologies mean that studies measuring multiple experimental conditions, such as time series, have become more common. At present, few computational methods exist to infer time series-specific transcriptome changes, and such studies have therefore typically used unsupervised pseudotime methods. While these methods identify cell subpopulations and the transitions between them, they are not appropriate for identifying the genes that vary coherently along the time series. In addition, the orderings they estimate are based only on the major sources of variation in the data, which may not correspond to the processes related to the time labels.

Results: We introduce psupertime, a supervised pseudotime approach based on a regression model, which explicitly uses time-series labels as input. It identifies genes that vary coherently along a time series and, in addition, provides pseudotime values for individual cells and a classifier that can be used to estimate labels for new data with unknown or differing labels. We show that psupertime outperforms benchmark classifiers in terms of identifying time-varying genes and provides better individual cell orderings than popular unsupervised pseudotime techniques. psupertime is applicable to any single-cell RNA-seq dataset with sequential labels (principally time series, but also, e.g., drug dosage and disease progression) derived from experimental design, and provides a fast, interpretable tool for targeted identification of genes varying along with specific biological processes.

Availability and implementation: R package available at github.com/wmacnair/psupertime and code for results reproduction at github.com/wmacnair/psupplementary.

Supplementary information: Supplementary data are available at Bioinformatics online.
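The abstract does not spell out the regression model. As a rough illustration of what "supervised pseudotime" can look like, the sketch below assumes an L1-penalized ordinal logistic model with shared gene weights and one cutpoint per label boundary, fitted by proximal gradient descent on simulated data. It is not the psupertime R package's implementation or API: the function names (`fit_ordinal_l1`, `predict_labels`), the simulated expression matrix and the hyperparameters are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_threshold(w, t):
    # Proximal step for the L1 penalty: shrinks weights towards zero.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_ordinal_l1(X, y, n_labels, lam=0.05, lr=0.1, n_iter=2000):
    """Assumed model: P(label > k | cell) = sigmoid(x . w - theta_k)."""
    n_cells, n_genes = X.shape
    w = np.zeros(n_genes)             # one weight per gene, shared by all cutpoints
    theta = np.zeros(n_labels - 1)    # one cutpoint between consecutive labels
    # beyond[i, k] = 1 if the label of cell i lies after cutpoint k
    beyond = (y[:, None] > np.arange(n_labels - 1)[None, :]).astype(float)
    for _ in range(n_iter):
        score = X @ w
        prob = sigmoid(score[:, None] - theta[None, :])
        resid = prob - beyond
        grad_w = X.T @ resid.sum(axis=1) / n_cells
        grad_theta = -resid.sum(axis=0) / n_cells
        w = soft_threshold(w - lr * grad_w, lr * lam)
        theta = theta - lr * grad_theta
    return w, theta


def predict_labels(X, w, theta):
    # Predicted label = number of cutpoints the cell's score exceeds.
    return ((X @ w)[:, None] > theta[None, :]).sum(axis=1)


# Simulated data (an assumption, not from the paper): 300 cells, 20 genes,
# 4 sequential labels; only genes 0 and 1 change along the hidden time axis.
rng = np.random.default_rng(0)
n_cells, n_genes, n_labels = 300, 20, 4
t_true = rng.uniform(0.0, 1.0, size=n_cells)
labels = np.minimum((t_true * n_labels).astype(int), n_labels - 1)
X = rng.normal(size=(n_cells, n_genes))
X[:, 0] = 2.0 * t_true + rng.normal(scale=0.3, size=n_cells)
X[:, 1] = -2.0 * t_true + rng.normal(scale=0.3, size=n_cells)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each gene

w, theta = fit_ordinal_l1(X, labels, n_labels)
pseudotime = X @ w                    # per-cell ordering along the labels
selected_genes = np.flatnonzero(w)    # genes kept by the L1 penalty
predicted = predict_labels(X, w, theta)

print("genes with non-zero weight:", selected_genes)
print("correlation with simulated time:", np.corrcoef(pseudotime, t_true)[0, 1])
print("training label accuracy:", np.mean(predicted == labels))
```

In this sketch the three outputs named in the abstract map onto three objects: the non-zero entries of `w` play the role of the label-associated genes, `X @ w` gives a pseudotime value per cell, and the cutpoints `theta` turn that value into a label for cells that were not used in fitting.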

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
