Abstract
Cell-free DNA (cfDNA) has shown promise as a non-invasive biomarker for cancer screening and monitoring. A current advanced machine learning (ML) model, known as DNA evaluation of fragments for early interception (DELFI), uses the fragmentation patterns of short and long cfDNA fragments and has demonstrated exceptional performance. However, the application of cfDNA-based models can be limited by the high cost of whole-genome sequencing (WGS). In this study, we present a novel ML model for cancer detection that uses cfDNA profiles generated from all protein-coding genes in the genome (the exome) with only 0.08X of WGS coverage. Our model was trained on a dataset of 721 cfDNA profiles, comprising 426 cancer patients and 295 healthy individuals. Ten-fold cross-validation showed that the new ML model using whole-exome regions, called xDELFI, achieves high accuracy in cancer detection (area under the ROC curve, AUC = 0.896, 95% CI = 0.878–0.916), comparable to the model using WGS (AUC = 0.920, 95% CI = 0.901–0.936). Notably, we observed distinct fragmentation patterns between exonic regions and the whole genome, suggesting unique genomic features within exonic regions. Furthermore, we demonstrate that combining mutation detection in cfDNA with xDELFI enhances the model's sensitivity. Our proof-of-principle study indicates that a fragmentomic ML model based solely on whole-exome regions retains its predictive capability. Given its ultra-low sequencing coverage, the new model could improve the accessibility of cfDNA-based cancer diagnosis and aid in the early detection and treatment of cancer.
Publisher
Cold Spring Harbor Laboratory