Abstract
Immunoglobulins are highly diverse, diverging from their originating germline genes primarily through somatic recombination and hypermutation. In some species, however, including rabbits and chickens, somatic gene conversion is also a strong driver of immunoglobulin diversity. Gene conversion is considerably harder to detect by sequence analysis than point mutations, and no dedicated tools currently exist for identifying these events. We present GECCO, the first dedicated gene conversion identification tool for immunoglobulins, based on modified, simultaneous, pairwise alignments to host and donor references. We benchmark our approach on simulated repertoires and find that GECCO has high recall and a low false positive rate, and is insensitive to somatic mutations. We apply this approach to characterize gene conversion events at the repertoire level in hyper-immunized rabbits, showing patterns of donor V gene preference and donor tract length distributions. This dedicated identification method allows the characterization of a feature of antibody repertoires that has not been accessible thus far. GECCO will benefit future studies exploring the prevalence of immunoglobulin gene conversion in additional species.
Publisher
Cold Spring Harbor Laboratory