Joint multi-omics discriminant analysis with consistent representation learning using PANDA

Published: 2024-05-17

Author:

Wu Jia¹, Aminu Muhammad¹, Hong Lingzhi¹, Vokes Natalie¹, Schmidt Stephanie², Saad Maliazurina B.¹, Zhu Bo¹, Li Xiuning¹, Cascone Tina¹, Sheshadri Ajay¹, Jaffray David¹, Futreal Andrew¹, Lee Jack², Byers Lauren¹, Gibbons Don¹, Heymach John², Chen Ken³, Cheng Chao⁴, Zhang Jianjun¹, Wang Bo⁵

Affiliation:

1. The University of Texas MD Anderson Cancer Center

2. MD Anderson Cancer Center

3. UT MD Anderson

4. Baylor College of Medicine

5. University of Toronto

Abstract

Integrative multi-omics analysis provides deeper insight into the underlying biology and causes of diseases, and enables more realistic modeling of them, than single-omics analysis does. Although several integrative multi-omics analysis methods have been proposed and have demonstrated promising results in integrating distinct omics datasets, inconsistent distributions across the different omics data, caused by technology variations, pose a challenge for paired integrative multi-omics methods. In addition, the existing discriminant analysis–based integrative methods do not effectively exploit correlation and consistent discriminant structures, which forces a compromise between correlation and discrimination when using these methods. Herein we present PAN-omics Discriminant Analysis (PANDA), a joint discriminant analysis method that seeks omics-specific discriminant common spaces by jointly learning consistent discriminant latent representations for each omics. PANDA jointly maximizes between-class and minimizes within-class omics variations in a common space and simultaneously models the relationships among omics at the consistency representation and cross-omics correlation levels, removing the need to compromise between discrimination and correlation that the existing integrative multi-omics methods impose. Because consistency representation learning is incorporated into the objective function of PANDA, the method seeks a common discriminant space that minimizes the differences in distributions among omics, can yield more robust latent representations than other methods, and is robust against inconsistency among the different omics. We compared PANDA to 10 other state-of-the-art multi-omics data integration methods using both simulated and real-world multi-omics datasets and found that PANDA consistently outperformed them while providing meaningful discriminant latent representations.
PANDA is implemented in both R and MATLAB, with code available at https://github.com/WuLabMDA/PANDA.
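The two quantities the abstract weighs against each other, class discrimination within an omics and correlation across omics, can be illustrated with a minimal Python sketch. This is not the PANDA objective or its R/MATLAB implementation: it assumes a single one-dimensional latent feature per omics, and the function names (`scatter_ratio`, `pearson`) and the toy values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import fmean


def scatter_ratio(values, labels):
    """Between-class scatter divided by within-class scatter (1-D feature).

    Larger values mean the classes are better separated along this feature.
    Assumes at least one class has non-zero spread (within > 0).
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    overall_mean = fmean(values)
    between = sum(len(g) * (fmean(g) - overall_mean) ** 2 for g in groups.values())
    within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    return between / within


def pearson(a, b):
    """Pearson correlation between two paired 1-D features."""
    mean_a, mean_b = fmean(a), fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical toy data: six paired samples, two classes, two omics.
labels = [0, 0, 0, 1, 1, 1]
omics_a = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
omics_b = [10.0, 11.0, 9.0, 20.0, 21.0, 19.0]

ratio_a = scatter_ratio(omics_a, labels)  # discrimination within omics A
ratio_b = scatter_ratio(omics_b, labels)  # discrimination within omics B
corr_ab = pearson(omics_a, omics_b)       # agreement across omics

print(ratio_a, ratio_b, corr_ab)
```

In this toy case both features separate the classes well and are also strongly correlated, so discrimination and correlation do not conflict; the compromise the abstract describes arises when improving one of these quantities degrades the other.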

Publisher

Research Square Platform LLC
