Exploring High-Dimensional Biological Data with Sparse Contrastive Principal Component Analysis

Published: 2019-11-09

Authors:

Philippe Boileau, Nima S. Hejazi, Sandrine Dudoit

Abstract

Motivation: Statistical analyses of high-throughput sequencing data have reshaped the biological sciences. In spite of myriad advances, recovering interpretable biological signal from data corrupted by technical noise remains a prevalent open problem. Several classes of procedures, among them classical dimensionality reduction techniques and others incorporating subject-matter knowledge, have provided effective advances; however, no procedure currently satisfies the dual objectives of recovering stable and relevant features simultaneously.

Results: Inspired by recent proposals for making use of control data in the removal of unwanted variation, we propose a variant of principal component analysis, sparse contrastive principal component analysis, that extracts sparse, stable, interpretable, and relevant biological signal. The new methodology is compared to competing dimensionality reduction approaches through a simulation study as well as via analyses of several publicly available protein expression, microarray gene expression, and single-cell transcriptome sequencing datasets.

Availability: A free and open-source software implementation of the methodology, the scPCA R package, is made available via the Bioconductor Project. Code for all analyses presented in the paper is also available via GitHub.
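The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea it names, not the scPCA package's procedure. It assumes the usual contrastive PCA formulation, in which directions are taken from the eigenvectors of the target covariance minus a multiple (here called `gamma`) of the control ("background") covariance, and it uses plain soft-thresholding as a crude stand-in for the sparsity step. All function names, the toy data, and the parameter values are hypothetical.

```python
import random

def covariance(rows):
    """Sample covariance matrix of column-centered data (list of rows)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_contrastive_direction(target, background, gamma, iters=500):
    """Leading eigenvector of C_target - gamma * C_background (power iteration)."""
    ct, cb = covariance(target), covariance(background)
    p = len(ct)
    # The contrastive matrix can have negative eigenvalues. Adding
    # gamma * trace(C_background) to the diagonal makes it positive
    # semidefinite without changing eigenvectors or their ordering.
    shift = gamma * sum(cb[i][i] for i in range(p))
    m = [[ct[i][j] - gamma * cb[i][j] + (shift if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    v = [1.0 / p ** 0.5] * p
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def soft_threshold(v, lam):
    """Shrink loadings toward zero, zeroing small ones, then renormalize."""
    s = [max(abs(x) - lam, 0.0) * (1 if x > 0 else -1) for x in v]
    norm = sum(x * x for x in s) ** 0.5
    return [x / norm for x in s] if norm > 0 else s

# Toy data (hypothetical): feature 0 is high-variance technical noise present
# in both datasets, feature 1 varies only in the target, feature 2 is minor noise.
rng = random.Random(0)
target = [[rng.gauss(0, 5), rng.gauss(0, 2), rng.gauss(0, 0.5)] for _ in range(500)]
background = [[rng.gauss(0, 5), rng.gauss(0, 0.1), rng.gauss(0, 0.5)] for _ in range(500)]

pca_dir = leading_contrastive_direction(target, background, gamma=0.0)   # ordinary PCA
cpca_dir = leading_contrastive_direction(target, background, gamma=2.0)  # contrastive
sparse_dir = soft_threshold(cpca_dir, lam=0.1)                           # sparse loadings

print("PCA direction:        ", [round(x, 3) for x in pca_dir])
print("contrastive direction:", [round(x, 3) for x in cpca_dir])
print("sparse contrastive:   ", [round(x, 3) for x in sparse_dir])
```

In this toy setting, ordinary PCA (`gamma=0.0`) is dominated by the shared noise feature, while the contrastive direction concentrates on the feature that varies only in the target data; thresholding then zeroes the small residual loadings. How `gamma` and the sparsity penalty are chosen in practice is not described in the abstract and is left open here.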

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. scPCA: A toolbox for sparse contrastive principal component analysis in R. Journal of Open Source Software, 2020-02-25.