Deep learning models are seeing increased use as methods to predict mutational effects or allowed mutations in proteins. The models commonly used for these purposes include large language models (LLMs) and 3D Convolutional Neural Networks (CNNs). These two model types have very different architectures and are commonly trained on different representations of proteins. LLMs make use of the transformer architecture and are trained purely on protein sequences whereas 3D CNNs are trained on voxelized representations of local protein structure. While comparable overall prediction accuracies have been reported for both types of models, it is not known to what extent these models make comparable specific predictions and/or generalize protein biochemistry in similar ways. Here, we perform a systematic comparison of two LLMs and two structure-based models (CNNs) and show that the different model types have distinct strengths and weaknesses. The overall prediction accuracies are largely uncorrelated between the sequence- and structure-based models. Overall, the two structure-based models are better at predicting buried aliphatic and hydrophobic residues whereas the two LLMs are better at predicting solvent-exposed polar and charged amino acids. Finally, we find that a combined model that takes the individual model predictions as input can leverage these individual model strengths and results in significantly improved overall prediction accuracy.
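The abstract does not specify the form of the combined model, so the following is only a minimal sketch of the general idea, assuming each individual model outputs a 20-way amino acid probability vector per site and that the combiner is a multinomial logistic regression over the concatenated vectors. The names `combine_features`, `fit_combiner`, `predict`, and `toy_demo` are hypothetical, and the data are synthetic toy values, not results from the paper.

```python
import numpy as np

N_AA = 20  # the 20 canonical amino acids


def combine_features(model_probs):
    """Concatenate per-model probabilities: list of (n_sites, 20) -> (n_sites, 20 * n_models)."""
    return np.concatenate(model_probs, axis=1)


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def fit_combiner(X, y, lr=0.5, epochs=300):
    """Assumed combiner: multinomial logistic regression fit by batch gradient descent."""
    n, d = X.shape
    W = np.zeros((d, N_AA))
    b = np.zeros(N_AA)
    Y = np.eye(N_AA)[y]  # one-hot encoding of the true residues
    for _ in range(epochs):
        P = softmax(X @ W + b)
        G = (P - Y) / n  # gradient of the mean cross-entropy w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b


def predict(X, W, b):
    """Return the most probable amino acid index for each site."""
    return softmax(X @ W + b).argmax(axis=1)


def toy_demo(sites_per_class=5):
    """Synthetic illustration: two models, each informative on a different half of the classes."""
    y = np.repeat(np.arange(N_AA), sites_per_class)
    one_hot = np.eye(N_AA)[y]
    uniform = np.full_like(one_hot, 1.0 / N_AA)
    first_half = (y < N_AA // 2)[:, None]
    probs_a = np.where(first_half, one_hot, uniform)  # model A knows classes 0-9 only
    probs_b = np.where(first_half, uniform, one_hot)  # model B knows classes 10-19 only
    X = combine_features([probs_a, probs_b])
    W, b = fit_combiner(X, y)

    def accuracy(pred):
        return float((pred == y).mean())

    # Accuracies are measured on the training sites; this is a toy check, not a benchmark.
    return accuracy(probs_a.argmax(axis=1)), accuracy(probs_b.argmax(axis=1)), accuracy(predict(X, W, b))
```

Calling `toy_demo()` returns the accuracies of the two toy models and of the combiner. In this contrived setting the combiner can exploit the complementary strengths of its inputs, which is the qualitative behavior the abstract describes; a real evaluation would need held-out proteins and actual model outputs.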
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055221 | PMC |
| http://dx.doi.org/10.1101/2023.03.20.533508 | DOI Listing |