Abstract
AbstractAnalysis of an individual’s immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene Reference Sets. The Adaptive Immune Receptor Repertoire-Community (AIRR-C) Reference Sets have been developed to include only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. By including only those alleles with a high level of support, including some new sequences that currently lack official names, AIRR-seq analysis will have greater accuracy and studies of the evolution of immunoglobulin genes, their allelic variants and the expressed immune repertoire will be facilitated. Although containing less than half the previously recognised IG alleles (e.g. just 198 IGHV sequences), the Reference Sets eliminated erroneous calls and provided excellent coverage when tested on a set of repertoires from 99 individuals comprising over 4 million V(D)J rearrangements. To improve AIRR-seq analysis, some alleles have been extended to deal with short 3’ or 5’ truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are only represented once in each set, though alternative sequence names are noted in accompanying metadata. The Reference Sets also include novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. The version-tracked AIRR-C Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly-observed and previously-reported sequences that can be confirmed by new high-quality data.
Publisher
Cold Spring Harbor Laboratory
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献