mEthAE: an Explainable AutoEncoder for methylation data
Author:
Katz Sonja, Martins dos Santos Vitor A.P., Saccenti Edoardo, Roshchupkin Gennady V.
Abstract
Despite the wealth of knowledge generated through epigenome-wide association studies (EWAS), our understanding of the relationships between CpG sites is still limited, as analysis of DNA methylation data remains difficult due to its high dimensionality. To combat this problem, deep learning algorithms such as autoencoders are increasingly applied to capture complex patterns and reduce dimensionality into a latent space. We believe that the way an autoencoder groups CpGs together in its latent dimensions has biological meaning and might reveal novel insights into the relationships between CpGs. Therefore, in this work, we propose a chromosome-wise autoencoder for interpretable dimensionality reduction of methylation data (mEthAE). Our framework reduces dimensionality by up to 400-fold relative to the input, without compromising reconstruction accuracy or predictive power in the latent space. Through our perturbation-based interpretability approach we revealed groups of CpGs that are highly connected across all latent dimensions (global CpGs) and were significantly more often reported in EWAS, indicating that our interpretability method can successfully identify CpGs with biological relevance. To gain a deeper understanding of the relationships between individual CpG sites, we focused on interpreting individual latent features and found that CpGs connected to a common feature neither share biological associations or correlation patterns nor lie in close proximity on the chromosome. We conclude that while there is evidence that the autoencoder does not group CpGs randomly, the logic behind the observed CpG relationships cannot be delineated easily. Based on the analyses in this work, we believe that the autoencoder groups CpGs according to long-range non-linear interaction patterns that lack characterisation in the current epigenetic research landscape.
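The perturbation-based interpretation summarised in the abstract can be illustrated with a toy sketch. The abstract does not specify the perturbation scheme, the network architecture, or the threshold used to call a CpG "connected", so everything below is an assumption for illustration: a fixed, untrained encoder with hand-set weights stands in for the trained autoencoder, each input CpG is shifted by a hypothetical `delta`, and a CpG is called "global" when its perturbation changes every latent feature. The names `perturbation_scores`, `toy_encode`, and `delta` are hypothetical and not taken from the paper.

```python
import numpy as np


def perturbation_scores(encode, X, delta=0.1):
    """Mean absolute change of each latent feature when one input is shifted.

    Returns an array of shape (n_inputs, n_latent).
    """
    base = encode(X)
    n_inputs = X.shape[1]
    scores = np.zeros((n_inputs, base.shape[1]))
    for j in range(n_inputs):
        X_pert = X.copy()
        # Shift one CpG while keeping beta values inside [0, 1].
        X_pert[:, j] = np.clip(X_pert[:, j] + delta, 0.0, 1.0)
        scores[j] = np.abs(encode(X_pert) - base).mean(axis=0)
    return scores


# Toy data: 50 samples x 6 CpGs, values capped at 0.8 so the shift never clips.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 0.8, size=(50, 6))

# Assumed stand-in encoder: fixed weights, 2 latent features.
W = np.zeros((6, 2))
W[0] = [1.0, 1.0]  # CpG 0 feeds both latent features
W[1] = [1.0, 0.0]  # CpG 1 feeds latent feature 0 only
W[2] = [0.0, 1.0]  # CpG 2 feeds latent feature 1 only


def toy_encode(X):
    return np.tanh(X @ W)


scores = perturbation_scores(toy_encode, X)
connected = scores > 1e-6  # assumed threshold for "connected"
global_cpgs = np.where(connected.all(axis=1))[0]
print(global_cpgs)
```

In this toy setting only CpG 0 is connected to every latent feature, so it is the only one flagged as global; with a trained model the threshold and perturbation size would need to be chosen with care.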
Publisher
Cold Spring Harbor Laboratory