PARE: A framework for removal of confounding effects from any distance-based dimension reduction method

Published:2024-07-10 Issue:7 Volume:20 Page:e1012241
ISSN:1553-7358
Container-title:PLOS Computational Biology
language:en
Short-container-title:PLoS Comput Biol

Author:

Chen Andrew A., Clark Kelly, Dewey Blake E., DuVal Anna, Pellegrini Nicole, Nair Govind, Jalkh Youmna, Khalil Samar, Zurawski Jon, Calabresi Peter A., Reich Daniel S., Bakshi Rohit, Shou Haochang, Shinohara Russell T.

Abstract

Dimension reduction tools preserving similarity and graph structure such as t-SNE and UMAP can capture complex biological patterns in high-dimensional data. However, these tools typically are not designed to separate effects of interest from unwanted effects due to confounders. We introduce the partial embedding (PARE) framework, which enables removal of confounders from any distance-based dimension reduction method. We then develop partial t-SNE and partial UMAP and apply these methods to genomic and neuroimaging data. For lower-dimensional visualization, our results show that the PARE framework can remove batch effects in single-cell sequencing data as well as separate clinical and technical variability in neuroimaging measures. We demonstrate that the PARE framework extends dimension reduction methods to highlight biological patterns of interest while effectively removing confounding effects.

Funder

National Institute of Neurological Disorders and Stroke

National Multiple Sclerosis Society

National Institute of Mental Health

University of Pennsylvania Center for Biomedical Image Computing and Analytics

Publisher

Public Library of Science (PLoS)
