Abstract
Genetic variants in Epstein-Barr virus (EBV) have been strongly associated with nasopharyngeal carcinoma (NPC) in South China. However, separate studies have reported different results regarding the most significant viral variants, pointing to polymorphisms in the EBER2 and BALF2 loci. In this study, we sequenced 100 new EBV genomes derived from 61 NPC cases and 39 population controls. We then conducted comprehensive genomic analyses of EBV sequences from NPC patients and healthy carriers in South China, totaling 279 cases and 227 controls. A genome-wide association meta-analysis identified a 4-bp deletion downstream of EBER2 (coordinates 7188–7191; EBER-del) as the variant most significantly associated with NPC. Furthermore, multiple viral variants were found to be genetically linked to EBER-del, forming a risk haplotype, which suggests that multiple viral variants might be associated with NPC pathogenesis. Population structure and phylogenetic analyses further characterized a high-risk EBV lineage for NPC and revealed a panel of 38 single nucleotide polymorphisms (SNPs), including those in the EBER2 and BALF2 loci. Using linkage disequilibrium clumping and a feature selection algorithm, the 38 SNPs could be narrowed down to 9 SNPs that can be used to accurately detect the high-risk EBV lineage. In summary, our study provides novel insight into the role of EBV genetic variation in NPC pathogenesis by defining a risk haplotype of EBV for downstream functional studies and by identifying a single high-risk EBV lineage, characterized by 9 SNPs, for potential application in population screening of NPC.
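The abstract does not state how the linkage disequilibrium clumping was configured, so the following is only a minimal sketch of the general technique, not the authors' pipeline. It assumes haploid 0/1 allele calls per viral genome, uses squared Pearson correlation as the LD measure, and applies a hypothetical r² threshold of 0.8. The function names, SNP labels, p-values, and toy data are all illustrative assumptions.

```python
def r_squared(a, b):
    """Squared Pearson correlation between two 0/1 allele vectors (an LD measure)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic site: LD is undefined, treat as unlinked
    return cov * cov / (var_a * var_b)


def ld_clump(alleles, p_values, r2_threshold=0.8):
    """Greedy clumping: walk SNPs from most to least significant and keep a
    SNP only if it is not in strong LD with any SNP already kept."""
    kept = []
    for snp in sorted(p_values, key=p_values.get):
        if all(r_squared(alleles[snp], alleles[k]) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


# Hypothetical toy data: one allele call per genome (8 genomes), not study data.
alleles = {
    "snp_a": [1, 1, 1, 0, 0, 0, 1, 0],
    "snp_b": [1, 1, 1, 0, 0, 0, 1, 0],  # identical to snp_a (r^2 = 1)
    "snp_c": [1, 0, 1, 0, 1, 0, 1, 0],  # weakly correlated with snp_a
    "snp_d": [0, 0, 0, 1, 1, 1, 0, 1],  # complement of snp_a (r^2 = 1)
}
p_values = {"snp_a": 1e-8, "snp_b": 1e-6, "snp_c": 1e-4, "snp_d": 1e-3}

representatives = ld_clump(alleles, p_values)
print(representatives)
```

In this toy case, snp_b and snp_d are dropped because they are in complete LD with the more significant snp_a, while snp_c survives as an independent signal. A real analysis would also typically restrict clumping by genomic distance and significance cutoffs, which this sketch omits.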
Publisher
Public Library of Science (PLoS)