Abstract
The integration of viruses into the human genome is known to be associated with tumorigenesis in many cancers, but accurate detection of integration breakpoints from short-read sequencing data is made difficult by human-viral homologies, viral genome heterogeneity, coverage limitations, and other factors. To address this, we present Exogene, a sensitive and efficient workflow for detecting viral integrations from paired-end next-generation sequencing data. Exogene's read filtering and breakpoint detection strategies yield integration coordinates that are highly concordant with long-read validation. We demonstrate this concordance across 6 TCGA hepatocellular carcinoma (HCC) tumor samples, identifying integrations of hepatitis B virus that are also supported by long reads. Additionally, we applied Exogene to targeted capture data from 426 previously studied HCC samples, achieving 98.9% concordance with existing methods and identifying 238 high-confidence integrations that were not previously reported. Exogene is applicable to multiple types of paired-end sequence data, including genome, exome, RNA-Seq, and targeted capture.
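To make the concordance figure concrete, the following is a minimal sketch of how agreement between two sets of integration breakpoints might be computed. It is not Exogene's implementation: the function names, the 1000 bp matching window, and the example coordinates are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A breakpoint is (chromosome, position); this representation is an assumption.
Breakpoint = Tuple[str, int]


def is_matched(bp: Breakpoint, others: List[Breakpoint], window: int) -> bool:
    """True if any breakpoint in `others` lies on the same chromosome within `window` bp."""
    chrom, pos = bp
    return any(c == chrom and abs(p - pos) <= window for c, p in others)


def concordance(reference: List[Breakpoint], calls: List[Breakpoint],
                window: int = 1000) -> float:
    """Fraction of reference breakpoints recovered by `calls` (window is an assumed tolerance)."""
    if not reference:
        raise ValueError("reference set is empty")
    matched = sum(is_matched(bp, calls, window) for bp in reference)
    return matched / len(reference)


def novel_calls(reference: List[Breakpoint], calls: List[Breakpoint],
                window: int = 1000) -> List[Breakpoint]:
    """Calls with no reference breakpoint nearby, i.e. candidates not previously reported."""
    return [bp for bp in calls if not is_matched(bp, reference, window)]


# Hypothetical coordinates, not taken from the paper.
previously_reported = [("chr5", 1295000), ("chr11", 118350000), ("chr19", 30300000)]
new_calls = [("chr5", 1295250), ("chr11", 118350400), ("chr8", 128750000)]

score = concordance(previously_reported, new_calls)
unreported = novel_calls(previously_reported, new_calls)
```

In this toy input, two of the three previously reported breakpoints have a nearby call and one call has no counterpart. The matching window trades strictness against tolerance for alignment ambiguity near the junction, so any reported concordance depends on the tolerance chosen.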
Funder
Mayo Clinic Center for Individualized Medicine
Publisher
Public Library of Science (PLoS)
Cited by
3 articles.