Simulation of adaptive immune receptors and repertoires with complex immune information to guide the development and benchmarking of AIRR machine learning.

Maria Chernigovskaya, Milena Pavlović, Chakravarthi Kanduri, Sofie Gielis, Philippe A Robert, Lonneke Scheffer, Andrei Slabodkin, Ingrid Hobæk Haff, Pieter Meysman, Gur Yaari, Geir Kjetil Sandve, Victor Greiff

Nucleic Acids Res

Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, 0372, Norway.

Published: January 2025

Machine learning (ML) has shown great potential in the adaptive immune receptor repertoire (AIRR) field. However, there is a lack of large-scale ground-truth experimental AIRR data suitable for AIRR-ML-based disease diagnostics and therapeutics discovery. Simulated ground-truth AIRR data are required to complement the development and benchmarking of robust and interpretable AIRR-ML methods where experimental data are currently inaccessible or insufficient. The challenge for simulated data to be useful is incorporating key features observed in experimental repertoires. These features, such as antigen- or disease-associated immune information, make AIRR-ML problems challenging. Here, we introduce LIgO, a software suite that simulates AIRR data for the development and benchmarking of AIRR-ML methods. LIgO incorporates different types of immune information at both the receptor and the repertoire level and preserves a native-like generation probability distribution. Additionally, LIgO assists users in determining the computational feasibility of their simulations. We show two examples where LIgO supports the development and validation of AIRR-ML methods: (i) how individuals carrying out-of-distribution immune information impact receptor-level prediction performance and (ii) how immune information co-occurring in the same AIRs impacts the performance of conventional receptor-level encoding and repertoire-level classification approaches. LIgO guides the advancement and assessment of interpretable AIRR-ML methods.

DOI: http://dx.doi.org/10.1093/nar/gkaf025
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773363

Top Keywords: airr-ml methods, development benchmarking, airr data, adaptive immune, machine learning, immune receptor, receptor repertoire, interpretable airr-ml, immune, airr

Similar Publications

simAIRR: simulation of adaptive immune repertoires with realistic receptor sequence sharing for benchmarking of immune state prediction methods.

Gigascience

December 2022

Centre for Bioinformatics, Department of Informatics, University of Oslo, 0373 Oslo, Norway.

Chakravarthi Kanduri, Lonneke Scheffer, Milena Pavlović, Knut Dagestad Rand, Maria Chernigovskaya

Background: Machine learning (ML) has gained significant attention for classifying immune states in adaptive immune receptor repertoires (AIRRs) to support the advancement of immunodiagnostics and therapeutics. Simulated data are crucial for the rigorous benchmarking of AIRR-ML methods. Existing approaches to generating synthetic benchmarking datasets result in the generation of naive repertoires missing the key feature of many shared receptor sequences (selected for common antigens) found in antigen-experienced repertoires.

Profiling the baseline performance and limits of machine learning models for adaptive immune receptor repertoire classification.

Gigascience

May 2022

Centre for Bioinformatics, Department of Informatics, University of Oslo, Oslo 0373, Norway.

Chakravarthi Kanduri, Milena Pavlović, Lonneke Scheffer, Keshav Motwani, Maria Chernigovskaya

Background: Machine learning (ML) methodology development for the classification of immune states in adaptive immune receptor repertoires (AIRRs) has seen a recent surge of interest. However, so far, there does not exist a systematic evaluation of scenarios where classical ML methods (such as penalized logistic regression) already perform adequately for AIRR classification. This hinders investigative reorientation to those scenarios where method development of more sophisticated ML approaches may be required.

The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires.

Nat Mach Intell

November 2021

Department of Informatics, University of Oslo, Norway.

Milena Pavlović, Lonneke Scheffer, Keshav Motwani, Chakravarthi Kanduri, Radmila Kompova

Article Synopsis

AIRR (Adaptive Immune Receptor Repertoires) are crucial for tracking immune responses, making them important in biomedical research.
Machine learning (ML) is useful for analyzing complex patterns in AIRR, but issues like reproducibility and transparency have slowed its adoption.
immuneML is an open-source tool that simplifies the AIRR ML process. It provides user-friendly interfaces and extensive documentation, and its effectiveness is demonstrated through applications in immune state prediction and antigen specificity.
