A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions.

Xikang Feng, Miaozhe Huo, He Li, Yongze Yang, Yuepeng Jiang, Liang He, Shuai Cheng Li

Brief Bioinform

Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon Tong, Hong Kong, 999077, China.

Published: November 2024

The complexity of T cell receptor (TCR) sequences, particularly within the complementarity-determining region 3 (CDR3), requires efficient embedding methods for applying machine learning to immunology. While various TCR CDR3 embedding strategies have been proposed, the absence of a systematic evaluation has left the community uncertain about which to use. Here, we extracted CDR3 embedding models from 19 existing methods and benchmarked them on four curated datasets by assessing their impact on the performance of TCR downstream tasks, including TCR-epitope binding affinity prediction, epitope-specific TCR identification, TCR clustering, and visualization analysis. We assessed these models using eight downstream classifiers and five downstream clustering methods, measuring performance with a diverse range of metrics for precision, robustness, and usability. Overall, handcrafted embeddings outperformed data-driven ones in modeling TCR-epitope interactions. To build on these comparative findings, we developed an all-in-one TCR CDR3 embedding package comprising all evaluated embedding models. This package will help users easily select suitable embedding models for their data.
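
As a rough illustration of what a handcrafted CDR3 embedding feeding a downstream classifier can look like, the sketch below maps each CDR3 string to a 20-dimensional amino-acid composition vector and labels a query by its nearest neighbour under cosine similarity. It is a toy example under stated assumptions: the function names (`composition_embedding`, `cosine_similarity`, `nearest_label`), the example sequences, and the labels are all hypothetical, and this is not one of the 19 benchmarked methods nor the API of the authors' package.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order that defines the vector layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_embedding(cdr3):
    """Map a CDR3 string to a 20-dim vector of residue frequencies.

    Hypothetical handcrafted embedding for illustration only.
    """
    if not cdr3:
        raise ValueError("CDR3 sequence must be non-empty")
    counts = Counter(cdr3.upper())
    length = len(cdr3)
    return [counts.get(aa, 0) / length for aa in AMINO_ACIDS]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_label(query, labelled):
    """Return the label of the labelled CDR3 most similar to the query.

    `labelled` is a list of (cdr3, label) pairs; this is a 1-nearest-neighbour
    stand-in for a downstream classifier.
    """
    q = composition_embedding(query)
    best = max(
        labelled,
        key=lambda item: cosine_similarity(q, composition_embedding(item[0])),
    )
    return best[1]


# Toy data: the sequences and epitope labels are made up for illustration.
reference = [("CASSLGQAYEQYF", "epitope_A"), ("CAVRDSNYQLIW", "epitope_B")]
predicted = nearest_label("CASSLGQAYEQYF", reference)
```

A composition vector ignores residue order, so it is only a baseline; the point is the shape of the pipeline (sequence to fixed-length vector to downstream model), which is what a benchmark of this kind varies and compares.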

DOI: http://dx.doi.org/10.1093/bib/bbaf030
