Removing unwanted variation from large-scale RNA sequencing data with PRPS

Published:2022-09-15 Issue:1 Volume:41 Page:82-95
ISSN:1087-0156
Container-title:Nature Biotechnology
language:en
Short-container-title:Nat Biotechnol

Author:

Molania Ramyar, Foroutan Momeneh, Gagnon-Bartsch Johann A., Gandolfo Luke C., Jain Aryan, Sinha Abhishek, Olshansky Gavriel, Dobrovic Alexander, Papenfuss Anthony T., Speed Terence P.

Abstract

Accurate identification and effective removal of unwanted variation is essential to derive meaningful biological results from RNA sequencing (RNA-seq) data, especially when the data come from large and complex studies. Using RNA-seq data from The Cancer Genome Atlas (TCGA), we examined several sources of unwanted variation and demonstrate here how these can significantly compromise various downstream analyses, including cancer subtype identification, association between gene expression and survival outcomes and gene co-expression analysis. We propose a strategy, called pseudo-replicates of pseudo-samples (PRPS), for deploying our recently developed normalization method, called removing unwanted variation III (RUV-III), to remove the variation caused by library size, tumor purity and batch effects in TCGA RNA-seq data. We illustrate the value of our approach by comparing it to the standard TCGA normalizations on several TCGA RNA-seq datasets. RUV-III with PRPS can be used to integrate and normalize other large transcriptomic datasets coming from multiple laboratories or platforms.

Funder

Ovarian Cancer Research Foundation

Prostate Cancer Foundation

National Breast Cancer Foundation

Department of Health | National Health and Medical Research Council

Lorenzo and Pamela Galli Medical Research Trust

Publisher

Springer Science and Business Media LLC

Subject

Biomedical Engineering, Molecular Medicine, Applied Microbiology and Biotechnology, Bioengineering, Biotechnology

Link

https://www.nature.com/articles/s41587-022-01440-w.pdf