Author:
Tian Shulan, Jenkinson Garrett, Ferrer Alejandro, Yan Huihuang, Morales-Rosado Joel A., Wang Kevin L., Lasho Terra L., Yan Benjamin B., Baheti Saurabh, Olson Janet E., Baughn Linda B., Ding Wei, Slager Susan L., Patnaik Mrinal S., Lazaridis Konstantinos N., Klee Eric W.
Abstract
Clonal hematopoiesis (CH) of indeterminate potential (CHIP), driven by somatic mutations in leukemia-associated genes, confers increased risk of hematologic malignancies, cardiovascular disease and all-cause mortality. In the blood of healthy individuals, small CH clones can expand over time to reach 2% variant allele frequency (VAF), the current threshold for CHIP. Nevertheless, reliable detection of low-VAF CHIP mutations is challenging and often relies on deep targeted sequencing. Here, we present UNISOM, a streamlined workflow for CHIP detection from whole-genome and whole-exome sequencing data, which are underpowered for this task, especially at low VAFs. UNISOM utilizes a meta-caller for variant detection, coupled with machine learning models that classify variants as CHIP, germline or artifact. In whole-exome data, UNISOM recovered nearly 80% of the CHIP mutations identified via deep targeted sequencing in the same cohort. Applied to whole-genome data from the Mayo Clinic Biobank, it recapitulated the patterns previously established in much larger cohorts, including the most frequently mutated CHIP genes, predominant mutation types and signatures, as well as strong associations of CHIP with age and smoking status. Notably, 30% of the identified CHIP mutations had VAFs below 5%, demonstrating the workflow's high sensitivity toward small mutant clones. This workflow is applicable to CHIP screening in population genomic studies.
Publisher
Cold Spring Harbor Laboratory