Abstract
Studying mutation in healthy somatic tissues is key for understanding the genesis of cancer and other genetic diseases. Mutation rate varies from site to site in the human genome by up to 100-fold and is influenced by numerous epigenetic and genetic factors, including GC content, trinucleotide sequence context, DNase accessibility, and histone modifications. These factors influence mutation at both local and regional scales and are often interrelated, meaning that predicting mutability or uncovering its drivers requires modelling multiple factors and scales simultaneously. Historically, most investigations have focused either on analyzing the local sequence scale through triplet signatures or on examining the impact of epigenetic processes at larger scales, but not both concurrently. Additionally, limitations of sequencing technology have restricted analyses to mutations in coding regions (RNA-seq) or to those that have been influenced by selection (bulk tissue). Here we present a comprehensive analysis of epigenetic and genetic factors at multiple scales in the germline and three somatic tissues. We leverage publicly available data for 21 genomic predictors and somatic mutations from single-cell whole-genome sequencing. We create models that accurately predict mutability in each tissue and compare how the genomic predictors of mutability vary across the human body. Our analysis reveals that triplets emerge as robust predictors of mutability relative to epigenetic factors. Importantly, we observe both universal and tissue-specific mutagenic processes in healthy tissues, with implications for understanding the maintenance of germline versus soma and the mechanisms underlying early tumorigenesis.
Publisher
Cold Spring Harbor Laboratory