LMNglyPred: prediction of human N-linked glycosylation sites using embeddings from a pre-trained protein language model.

Subash C Pakhrin, Suresh Pokharel, Kiyoko F Aoki-Kinoshita, Moriah R Beck, Tarun K Dam, Doina Caragea, Dukka B Kc

Glycobiology

Department of Computer Science, College of Computing, Michigan Technological University, Houghton, MI 49931, USA.

Published: June 2023

Protein N-linked glycosylation is an important post-translational mechanism in Homo sapiens, playing essential roles in many vital biological processes. It occurs at the N-X-[S/T] sequon in amino acid sequences, where X can be any amino acid except proline. However, not all N-X-[S/T] sequons are glycosylated; the sequon is therefore a necessary but not sufficient determinant of protein glycosylation. Computational prediction of N-linked glycosylation sites confined to N-X-[S/T] sequons is thus an important problem, one that existing methods have not addressed extensively, especially with respect to the construction of negative sets and the use of distilled information from protein language models (pLMs). Here, we developed LMNglyPred, a deep learning-based approach that predicts N-linked glycosylation sites in human proteins using embeddings from a pre-trained pLM. On a benchmark-independent test set, LMNglyPred achieves a sensitivity of 76.50%, a specificity of 75.36%, a Matthews correlation coefficient (MCC) of 0.49, a precision of 60.99%, and an accuracy of 75.74%. These results demonstrate that LMNglyPred is a robust computational tool for predicting N-linked glycosylation sites confined to the N-X-[S/T] sequon.
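As an illustration of the two technical ideas in the abstract, the following is a minimal sketch, not the authors' implementation. It shows how candidate N-X-[S/T] sequons (X not proline) might be located in a sequence, and how the five reported metrics follow from confusion-matrix counts. The function names `find_sequons` and `binary_metrics`, the example sequence, and the example counts are all hypothetical and chosen only for illustration; they do not come from the paper or its data.

```python
import math
import re

# N followed by any residue except proline (P), then serine (S) or threonine (T).
# The lookahead lets overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of the asparagine in every N-X-[S/T] sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def binary_metrics(tp, fp, tn, fn):
    """Compute the five metrics named in the abstract from confusion counts.

    Assumes every denominator is non-zero (both classes present and
    both labels predicted at least once).
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical toy sequence: "NGS" and the overlapping "NNS"/"NSS" match, "NPT" does not.
candidates = find_sequons("MNGSANPTNNSS")

# Hypothetical counts, not taken from the paper.
scores = binary_metrics(tp=8, fp=4, tn=6, fn=2)
```

Each candidate position returned by `find_sequons` is only a site that satisfies the sequon rule; whether it is actually glycosylated is what a predictor would have to decide. Note also that the MCC ranges from -1 to 1 and is not a percentage, unlike the other four metrics.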

DOI: http://dx.doi.org/10.1093/glycob/cwad033
