Contrastive learning of T cell receptor representations.

Cell Syst

Division of Infection and Immunity, University College London, London WC1E 6BT, UK; Institute for the Physics of Living Systems, University College London, London WC1E 6BT, UK.

Published: January 2025

Computational prediction of the interaction of T cell receptors (TCRs) and their ligands is a grand challenge in immunology. Despite advances in high-throughput assays, specificity-labeled TCR data remain sparse. In other domains, pre-training of language models on unlabeled data has been successfully used to address data bottlenecks. However, it is unclear how best to pre-train protein language models for TCR specificity prediction. Here, we introduce a TCR language model called SCEPTR (simple contrastive embedding of the primary sequence of T cell receptors), which is capable of data-efficient transfer learning. Through our model, we introduce a pre-training strategy combining autocontrastive learning and masked-language modeling, which enables SCEPTR to achieve state-of-the-art performance. In contrast, existing protein language models and a variant of SCEPTR pre-trained without autocontrastive learning are outperformed by sequence alignment-based methods. We anticipate that contrastive learning will be a useful paradigm for decoding the rules of TCR specificity. A record of this paper's transparent peer review process is included in the supplemental information.
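The abstract names the two pre-training objectives (autocontrastive learning and masked-language modeling) but not how they are combined. Below is a minimal sketch of one common way to join them, assuming a SimCSE-style setup in which two stochastic (dropout-noised) forward passes of the same sequence form a positive pair for an InfoNCE loss. The function name, the `alpha` weighting, and the `model` interface returning (token logits, pooled embedding) are illustrative assumptions, not SCEPTR's actual implementation.

import torch
import torch.nn.functional as F

def autocontrastive_mlm_loss(model, tokens, mask_labels, temperature=0.05, alpha=1.0):
    """Joint pre-training objective sketch: MLM cross-entropy plus a
    SimCSE-style autocontrastive InfoNCE term.

    Assumes `model(tokens)` returns (token_logits, pooled_embedding),
    where token_logits has shape (batch, seq_len, vocab) and
    pooled_embedding has shape (batch, dim). Hypothetical interface.
    """
    # Two forward passes in train mode: dropout noise yields two "views"
    # of each sequence, which serve as positive pairs.
    logits_a, z_a = model(tokens)
    _, z_b = model(tokens)

    # MLM loss: cross-entropy on masked positions only
    # (convention: unmasked positions labeled -100 are ignored).
    mlm = F.cross_entropy(logits_a.transpose(1, 2), mask_labels, ignore_index=-100)

    # Autocontrastive InfoNCE: each embedding's positive is its dropout
    # twin; all other sequences in the batch act as in-batch negatives.
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    sim = z_a @ z_b.T / temperature                      # (batch, batch)
    targets = torch.arange(sim.size(0), device=sim.device)
    contrastive = F.cross_entropy(sim, targets)

    return mlm + alpha * contrastive

The key design choice this sketch illustrates is that the contrastive term shapes the pooled sequence embedding directly, so distances in embedding space become meaningful for downstream nearest-neighbor or few-shot specificity prediction, whereas MLM alone only supervises per-token reconstruction.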


Source: http://dx.doi.org/10.1016/j.cels.2024.12.006

