Abstract
Peptide vaccines present a safe and cost-efficient alternative to traditional vaccines. Their efficacy depends on the peptides included in the vaccine and the ability of major histocompatibility complex (MHC) molecules to bind and present these peptides. Due to the high diversity of MHC alleles, their diverging peptide binding specificities, and physical constraints on the maximum length of peptide vaccine constructs, choosing a set of peptides that effectively achieves immunization across a large proportion of the population is challenging.

Here, we present HOGVAX, a combinatorial optimization approach to select peptides that maximize population coverage. The key idea behind HOGVAX is to exploit overlaps between peptide sequences to include a large number of peptides in limited space and thereby also cover rare MHC alleles. We formalize the vaccine design task as a theoretical problem, which we call the Maximum Scoring k-Superstring Problem (MSKS). We show that MSKS is NP-hard, reformulate it into a graph problem using the hierarchical overlap graph (HOG), and present a haplotype-aware variant of MSKS to take linkage disequilibrium between MHC loci into account. We give an integer linear programming formulation for the graph problem and provide an open-source implementation.

We demonstrate on a SARS-CoV-2 case study that HOGVAX-designed vaccine formulations contain significantly more peptides than vaccine sequences built from concatenated peptides. We predict over 98% population coverage and high numbers of per-individual presented peptides, leading to robust immunity against new pathogens or viral variants.
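The space saving from overlapping peptides can be illustrated with a minimal Python sketch. It is not the HOGVAX method: it uses a simple greedy merge instead of the hierarchical overlap graph and integer linear program, and it ignores both peptide scores and the length limit k. The function names and the three toy peptide strings are assumptions chosen for illustration only.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(peptides: list[str]) -> str:
    """Greedy heuristic: repeatedly merge the pair with the largest overlap.

    This is only an illustration of overlap-based compression, not an
    exact or score-aware solver.
    """
    # Drop duplicates and peptides fully contained in another peptide.
    items = sorted(set(peptides))
    items = [p for p in items if not any(p != q and p in q for q in items)]

    while len(items) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the best pair, writing the shared region only once.
        merged = items[best_i] + items[best_j][best_k:]
        items = [p for idx, p in enumerate(items) if idx not in (best_i, best_j)]
        items.append(merged)

    return items[0] if items else ""


# Hypothetical toy peptides with large pairwise overlaps.
peptides = ["YLQPRTFLL", "QPRTFLLKY", "TFLLKYNEN"]
superstring = greedy_superstring(peptides)
concatenated = "".join(peptides)

print(superstring, len(superstring))  # YLQPRTFLLKYNEN 14
print(len(concatenated))              # 27
```

In this toy case, all three 9-mers fit into 14 residues instead of the 27 needed for plain concatenation, which is the effect that lets an overlap-based design pack more peptides under a fixed length limit.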
Publisher
Cold Spring Harbor Laboratory