PeptideBERT: A Language Model Based on Transformers for Peptide Property Prediction.

J Phys Chem Lett

Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States.

Published: November 2023

Recent advances in language models have provided the protein modeling community with a powerful tool that uses transformers to represent protein sequences as text. This breakthrough enables sequence-to-property prediction for peptides without relying on explicit structural data. Inspired by recent progress in the field of large language models, we present PeptideBERT, a protein language model specifically tailored for predicting essential peptide properties such as hemolysis, solubility, and nonfouling. PeptideBERT utilizes the ProtBERT pretrained transformer model with 12 attention heads and 12 hidden layers. Through fine-tuning the pretrained model for the three downstream tasks, our model is state of the art (SOTA) in predicting hemolysis, which is crucial for determining a peptide's potential to induce red blood cell lysis, as well as in predicting nonfouling properties. Leveraging primarily shorter sequences and a data set with negative samples predominantly associated with insoluble peptides, our model showcases remarkable performance.
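
As an illustration of the approach described in the abstract, the following is a minimal sketch of fine-tuning a ProtBERT checkpoint for a binary peptide property (here, hemolysis) using the HuggingFace Transformers library. The checkpoint name, classification head, training configuration, example peptide, and label are illustrative assumptions and are not taken from the PeptideBERT paper itself.

import torch
from transformers import BertTokenizer, BertForSequenceClassification

# ProtBERT expects uppercase amino acids separated by single spaces.
# Checkpoint assumed for illustration; the exact pretrained weights used
# by the authors may differ.
tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertForSequenceClassification.from_pretrained(
    "Rostlab/prot_bert",
    num_labels=2,  # e.g., hemolytic vs. non-hemolytic (assumed binary setup)
)

# Hypothetical peptide and label, used only for illustration.
peptide = "GLFDIVKKVVGALGSL"
inputs = tokenizer(" ".join(peptide), return_tensors="pt", truncation=True)
labels = torch.tensor([1])

# One fine-tuning step: the forward pass returns the classification loss.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Inference: predicted class probabilities for the peptide.
model.eval()
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(probs)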


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10683064
DOI: http://dx.doi.org/10.1021/acs.jpclett.3c02398

Publication Analysis

Top Keywords

language model (8)
language models (8)
model (6)
peptidebert language (4)
model based (4)
based transformers (4)
transformers peptide (4)
peptide property (4)
property prediction (4)
prediction advances (4)

Similar Publications

Accurate classification of logos is a challenging task in image recognition due to variations in logo size, orientation, and background complexity. Deep learning models, such as VGG16, have demonstrated promising results in handling such tasks. However, their performance depends heavily on optimal hyperparameter settings, and tuning these is both labor-intensive and time-consuming.


With breakthroughs in Natural Language Processing and Artificial Intelligence (AI), the usage of Large Language Models (LLMs) in academic research has increased tremendously. Models such as Generative Pre-trained Transformer (GPT) are used by researchers in literature review, abstract screening, and manuscript drafting. However, these models also present the attendant challenge of providing ethically questionable scientific information.


Evaluating large language models for criterion-based grading from agreement to consistency.

NPJ Sci Learn

December 2024

Department of Psychology, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Bandar Sunway, 475000, Malaysia.

This study evaluates the ability of large language models (LLMs) to deliver criterion-based grading and examines the impact of prompt engineering with detailed criteria on grading. Using well-established human benchmarks and quantitative analyses, we found that even free LLMs achieve criterion-based grading with a detailed understanding of the criteria, underscoring the importance of domain-specific understanding over model complexity. These findings highlight the potential of LLMs to deliver scalable educational feedback.


Deep neural networks drive the success of natural language processing. A fundamental property of language is its compositional structure, allowing humans to systematically produce forms for new meanings. For humans, languages with more compositional and transparent structures are typically easier to learn than those with opaque and irregular structures.


Resilience is central to young children's healthy and happy development. The Child and Youth Resilience Measure (CYRM-R) has been widely used in several countries. However, its construct validity among young children in rural South Africa has not been examined.

