Benchmarking of analytical combinations for COVID-19 outcome prediction using single-cell RNA sequencing data

Published: 2023-01-18

Author:

Cao Yue, Ghazanfar Shila, Yang Pengyi, Yang Jean

Abstract

Advances in single-cell transcriptomic technologies have led to increasing use of single-cell RNA sequencing (scRNA-seq) data in large-scale patient cohort studies. The resulting high-dimensional data can be summarised and incorporated into patient outcome prediction models in several ways; however, there is a pressing need to understand the impact of analytical decisions on the quality of such models. In this study, we evaluate the impact of analytical choices, namely model choice, ensemble learning strategy and integration approach, on patient outcome prediction using five scRNA-seq COVID-19 datasets. First, we examine the difference in performance between using each single-view feature space and using a multi-view feature space. Next, we survey multiple learning platforms, from classical machine learning to modern deep learning methods. Lastly, we compare different integration approaches for cases where combining datasets is necessary. Through benchmarking such analytical combinations, our study highlights the power of ensemble learning, the consistency among different learning methods and the robustness to dataset normalisation when multiple datasets are used as the model input.

Summary key points

This work assesses and compares the performance of three categories of workflow, comprising 350 analytical combinations, for outcome prediction using multi-sample, multi-condition single-cell studies.

We observed that using an ensemble of feature types performs better than using any individual feature type.

We found that in the current data, all learning approaches including deep learning exhibit similar predictive performance. When combining multiple datasets as the input, our study found that integrating multiple datasets at the cell level performs similarly to simply concatenating the patient representation without modification.
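
The idea of an ensemble over feature types can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the nearest-centroid classifier, the view names (`cell_type_proportion`, `mean_expression`), the 0/1 outcome labels and all numbers are hypothetical and chosen only to show how per-view scores might be averaged into one patient-level prediction.

```python
from statistics import mean

def fit_centroids(X, y):
    """Fit one feature view: return the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def score_view(centroids, x):
    """Soft score in [0, 1] for class 1: closer to the class-1 centroid gives a higher score."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, centroids[0])) ** 0.5
    d1 = sum((a - b) ** 2 for a, b in zip(x, centroids[1])) ** 0.5
    if d0 + d1 == 0:
        return 0.5  # both centroids coincide with x, so the view is uninformative
    return d0 / (d0 + d1)

def ensemble_predict(views_train, y, views_test):
    """Train one model per view, average the per-view scores, threshold at 0.5."""
    models = {name: fit_centroids(X, y) for name, X in views_train.items()}
    n_test = len(next(iter(views_test.values())))
    predictions = []
    for i in range(n_test):
        scores = [score_view(models[name], views_test[name][i]) for name in models]
        predictions.append(1 if mean(scores) >= 0.5 else 0)
    return predictions

# Hypothetical patient-level representations; rows are patients.
# Labels are illustrative only: 0 = mild outcome, 1 = severe outcome.
views_train = {
    "cell_type_proportion": [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]],
    "mean_expression": [[1.0, 0.2], [1.2, 0.1], [0.1, 1.1], [0.3, 0.9]],
}
y = [0, 0, 1, 1]
views_test = {
    "cell_type_proportion": [[0.65, 0.35], [0.25, 0.75]],
    "mean_expression": [[1.1, 0.15], [0.2, 1.0]],
}

predictions = ensemble_predict(views_train, y, views_test)
print(predictions)
```

Passing a dictionary with a single view to `ensemble_predict` gives the corresponding single-feature-type baseline, so the same function can be used for a like-for-like comparison between individual views and the ensemble.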

Publisher

Cold Spring Harbor Laboratory
