Author:
McGreevy Kristen M, Chen Brian H, Horvath Steve, Telesca Donatello
Abstract
DNA methylation (DNAm) is an epigenetic mechanism vital for regulating gene expression and influencing disease states. Developing accurate DNAm biomarkers often requires data from specific tissues, which are sometimes difficult to access. This study explores the use of Transfer Learning (TL) to predict blood DNAm biomarkers using saliva DNAm data, aiming to overcome limitations posed by sample size and tissue accessibility. We developed TL-based algorithms that integrate DNAm data from multiple tissues. These algorithms were evaluated against traditional Lasso regression and direct saliva DNAm estimates. Our results show that TL significantly improves the prediction accuracy of DNAm biomarkers, outperforming traditional methods in 20 out of 26 biomarkers. We further validated our models using independent datasets, demonstrating that TL-derived predictions reflect known biological relationships, such as sex differences in telomere length and the impact of smoking on DNAm biomarkers. Our findings highlight the potential of TL in enhancing DNAm biomarker prediction across tissues, providing a valuable tool for epigenetic research. The developed algorithms and methodologies are accessible to researchers, fostering advancements in personalized medicine and aging research. This study establishes a framework for utilizing TL to bridge the gap between accessible and pertinent tissue data, paving the way for more accurate and versatile DNAm biomarker applications.
ACM Reference Format: Kristen M McGreevy, Brian H Chen, Steve Horvath, and Donatello Telesca. 2024. Cross Tissue DNAm Biomarker Prediction using Transfer Learning. 1, 1 (June 2024), 43 pages.
Publisher
Cold Spring Harbor Laboratory