Affiliation:
1. Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
Abstract
Identifying biomarkers by integrating multiple omics is important because complex diseases arise from an intricate interplay of various genetic materials. Traditional single‐omics association tests neither explore this crucial interomics dependence nor identify moderately weak signals, owing to the multiple‐testing burden. Multiomics data integration, in contrast, provides complementary information but suffers from an increased multiple‐testing burden, the data diversity inherent in different omics features, high dimensionality, and so forth. Most available methods address subtype classification using dimension‐reduction techniques to circumvent the sample size issue, but methods for identifying interacting multiomics biomarkers are unavailable. We propose a two‐step model that first investigates phenotype‐omics association using logistic regression and then selects disease‐associated omics using sparse principal components, exploring the interrelationship of multiple variables from two omics in a multivariate multiple regression framework. On the basis of this model, we developed a multiomics biomarker identification algorithm, interacting omics search (ioSearch), that jointly tests the association of multiple omics with disease and the between‐omics associations, using pathway information to reduce the multiple‐testing burden. Further, inference in terms of p values potentially makes it an easily interpretable biomarker identification tool. Extensive simulations demonstrate that ioSearch is statistically powerful with a controlled Type‐I error rate. Its application to publicly available breast cancer data sets identified relevant omics features in important pathways.
Subject
Genetics (clinical), Epidemiology