Affiliation:
1. Research Unit of Mathematical Sciences, Biocenter Oulu, University of Oulu, 90014, Finland and
2. Infotech Oulu, University of Oulu, 90014, Finland
Abstract
Abstract
The Gaussian process (GP) regression is theoretically capable of capturing higher-order gene-by-gene interactions important to trait variation non-exhaustively with high accuracy. Unfortunately, GP approach is scalable only for 100-200 genes and thus, not applicable for high...
Gaussian process (GP)-based automatic relevance determination (ARD) is known to be an efficient technique for identifying determinants of gene-by-gene interactions important to trait variation. However, the estimation of GP models is feasible only for low-dimensional datasets (∼200 variables), which severely limits application of the GP-based ARD method for high-throughput sequencing data. In this paper, we provide a nonparametric prescreening method that preserves virtually all the major benefits of the GP-based ARD method and extends its scalability to the typical high-dimensional datasets used in practice. In several simulated test scenarios, the proposed method compared favorably with existing nonparametric dimension reduction/prescreening methods suitable for higher-order interaction searches. As a real-data example, the proposed method was applied to a high-throughput dataset downloaded from the cancer genome atlas (TCGA) with measured expression levels of 16,976 genes (after preprocessing) from patients diagnosed with acute myeloid leukemia.
Publisher
Oxford University Press (OUP)
Cited by
6 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献