Article Abstract

Deep learning models are seeing increased use as methods to predict mutational effects or allowed mutations in proteins. The models commonly used for these purposes include large language models (LLMs) and 3D Convolutional Neural Networks (CNNs). These two model types have very different architectures and are commonly trained on different representations of proteins. LLMs make use of the transformer architecture and are trained purely on protein sequences whereas 3D CNNs are trained on voxelized representations of local protein structure. While comparable overall prediction accuracies have been reported for both types of models, it is not known to what extent these models make comparable specific predictions and/or generalize protein biochemistry in similar ways. Here, we perform a systematic comparison of two LLMs and two structure-based models (CNNs) and show that the different model types have distinct strengths and weaknesses. The overall prediction accuracies are largely uncorrelated between the sequence- and structure-based models. Overall, the two structure-based models are better at predicting buried aliphatic and hydrophobic residues whereas the two LLMs are better at predicting solvent-exposed polar and charged amino acids. Finally, we find that a combined model that takes the individual model predictions as input can leverage these individual model strengths and results in significantly improved overall prediction accuracy.
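
The abstract does not say how the combined model is built, so the following is only a minimal sketch of the general idea: a second-stage classifier whose sole input is the per-residue probability vectors produced by the individual models. Everything in it is an assumption made for illustration. The data are synthetic, `simulate_model` is a hypothetical stand-in for a real predictor, the split of the 20 classes into two groups of ten is arbitrary, and a plain softmax regression is swapped in for whatever architecture the authors actually used. For brevity it simulates one structure-like and one sequence-like model rather than two of each.

```python
import numpy as np

N_CLASSES = 20  # one class per standard amino acid


def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def simulate_model(y, strong_classes, rng):
    """Synthetic stand-in for one predictor (hypothetical, not a real model).

    Returns (n, 20) probabilities that favor the true residue only when the
    true residue belongs to `strong_classes`; elsewhere the output is noise.
    """
    logits = rng.normal(size=(len(y), N_CLASSES))
    strong = np.flatnonzero(np.isin(y, strong_classes))
    logits[strong, y[strong]] += 6.0  # confident and correct on these sites
    return softmax(logits)


def fit_combined(X, y, epochs=200, lr=1.0):
    """Softmax regression trained by full-batch gradient descent."""
    W = np.zeros((X.shape[1], N_CLASSES))
    b = np.zeros(N_CLASSES)
    Y = np.eye(N_CLASSES)[y]  # one-hot targets
    for _ in range(epochs):
        G = (softmax(X @ W + b) - Y) / len(y)  # gradient w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b


def accuracy(probs, y):
    """Fraction of sites where the top-ranked class is the true residue."""
    return float((probs.argmax(axis=1) == y).mean())


rng = np.random.default_rng(0)
y = rng.integers(0, N_CLASSES, size=6000)  # synthetic "true" residues

# Assumed split: classes 0-9 play the role of residues the structure-like
# model handles well, classes 10-19 those the sequence-like model handles well.
p_struct = simulate_model(y, np.arange(0, 10), rng)
p_seq = simulate_model(y, np.arange(10, 20), rng)

# The individual predictions are the only input to the combined model.
X = np.hstack([p_struct, p_seq])
train, test = slice(0, 4000), slice(4000, None)
W, b = fit_combined(X[train], y[train])

acc_struct = accuracy(p_struct[test], y[test])
acc_seq = accuracy(p_seq[test], y[test])
acc_combined = accuracy(softmax(X[test] @ W + b), y[test])
print(f"structure-like: {acc_struct:.2f}  sequence-like: {acc_seq:.2f}  "
      f"combined: {acc_combined:.2f}")
```

Because each simulated model is reliable on a different half of the classes, the second-stage classifier can learn which input to trust for which residue type. That is the mechanism the abstract points to, although the numbers printed here say nothing about the real models.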

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055221
DOI: http://dx.doi.org/10.1101/2023.03.20.533508

Publication Analysis

Top Keywords

structure-based models: 16
models: 9
sequence- structure-based: 8
protein biochemistry: 8
cnns model: 8
model types: 8
prediction accuracies: 8
better predicting: 8
individual model: 8
model: 5

Similar Publications

Advances in next-generation sequencing technology have enabled the high-throughput profiling of metagenomes and accelerated microbiome studies. Recently, there has been a rise in quantitative studies that aim to decipher the microbiome co-occurrence network and its underlying community structure based on metagenomic sequence data. Uncovering the complex microbiome community structure is essential to understanding the role of the microbiome in disease progression and susceptibility.

Introduction: Pregnant women's experiences and concerns regarding childbirth are complex, necessitating a multidimensional and personalized approach in maternal care. This study explores the psychological and emotional factors influencing pregnant women's decisions regarding their mode of delivery. The results will provide valuable insights for the development of educational and counseling strategies designed to support pregnant women in making informed and conscious decisions about their childbirth.

Abstract: Alzheimer's disease (AD) and Parkinson's disease (PD) are neurological conditions that primarily affect the elderly, with distinctive traits and some similarities in symptoms and progression. The multifactorial nature of AD and PD encourages exploring multi-target therapy for these conditions as an alternative to the conventional "one drug, one target" strategy. This study highlights the search for potential HDAC4 inhibitors through multiple screening approaches.

Deep learning methods for proteome-scale interaction prediction.

Curr Opin Struct Biol

January 2025

Department of Biological Sciences, Seoul National University, Seoul 08826, Republic of Korea.

Proteome-scale interaction prediction is essential for understanding protein functions and disease mechanisms. Traditional experimental methods are often limited by scale and complexity, driving the need for computational approaches. Deep learning has emerged as a powerful tool, enabling high-throughput, accurate predictions of protein interactions.

On topological characterizations and computational analysis of benzenoid networks for drug discovery and development.

J Mol Graph Model

January 2025

Department of Mathematics & Actuarial Science, B. S. Abdur Rahman Crescent Institute of Science and Technology, Chennai, Tamil Nadu, 600048, India.

Topological indices are numerical invariants that provide key insights into the structural properties of molecular graphs and are crucial in predicting physicochemical and biological activities. This paper applies established computational methodologies to analyze benzenoid networks and their application to polycyclic aromatic hydrocarbons (PAHs), using degree-based topological indices computed via M-polynomial and NM-polynomial approaches. By examining tessellations, including linear chain, hexagonal, rhomboidal, and triangular configurations alongside their line graphs, this work highlights the influence of molecular topology on biological activity.
