Simulation of adaptive immune receptors and repertoires with complex immune information to guide the development and benchmarking of AIRR machine learning

Published: 2023-10-23

Authors:

Chernigovskaya Maria, Pavlović Milena, Kanduri Chakravarthi, Gielis Sofie, Robert Philippe A., Scheffer Lonneke, Slabodkin Andrei, Haff Ingrid Hobæk, Meysman Pieter, Yaari Gur, Sandve Geir Kjetil, Greiff Victor

Abstract

Machine learning (ML) has shown great potential in the adaptive immune receptor repertoire (AIRR) field. However, there is a lack of large-scale ground-truth experimental AIRR data suitable for AIRR-ML-based disease diagnostics and therapeutics discovery. Simulated ground-truth AIRR data are required to complement the development and benchmarking of robust and interpretable AIRR-ML methods where experimental data are currently inaccessible or insufficient. The challenge for simulated data to be useful is incorporating key features observed in experimental repertoires. These features, such as antigen or disease-associated immune information, cause AIRR-ML problems to be challenging. Here, we introduce LIgO, a software suite, which simulates AIRR data for the development and benchmarking of AIRR-ML methods. LIgO incorporates different types of immune information both on the receptor and the repertoire level and preserves a native-like generation probability distribution. Additionally, LIgO assists users in determining the computational feasibility of their simulations. We show two examples where LIgO supports the development and validation of AIRR-ML methods: (1) how individuals carrying out-of-distribution immune information impact receptor-level prediction performance and (2) how immune information co-occurring in the same AIRs impacts the performance of conventional receptor-level encoding and repertoire-level classification approaches. LIgO guides the advancement and assessment of interpretable AIRR-ML methods.

Publisher

Cold Spring Harbor Laboratory

References: 125 articles.

1. Precursor Frequency and Affinity Determine B Cell Competitive Fitness in Germinal Centers, Tested with Germline-Targeting HIV Vaccine Immunogens

2. Convergent Sequence Features of Antiviral B Cells

3. Progress and Challenges for the Machine Learning-Based Design of Fit-for-Purpose Monoclonal Antibodies; mAbs, 2022

4. A Compact Vocabulary of Paratope-Epitope Interactions Enables Predictability of Antibody-Antigen Binding; Cell Reports, 2021

5. In Silico Proof of Principle of Machine Learning-Based Antibody Design at Unconstrained Scale; mAbs, 2022

Cited by 2 articles.

1. Best practices for machine learning in antibody discovery and development; Drug Discovery Today, 2024-07

2. Adaptive immune receptor repertoire analysis; Nature Reviews Methods Primers, 2024-01-25