Size and structure of the sequence space of repeat proteins.

PLoS Comput Biol

Laboratoire de physique de l'École normale supérieure (PSL University), CNRS, Sorbonne Université, and Université de Paris, 75005 Paris, France.

Published: August 2019

The coding space of protein sequences is shaped by evolutionary constraints set by requirements of function and stability. We show that the coding space of a given protein family, defined as the total number of sequences in that family, can be estimated using maximum entropy models trained on multiple sequence alignments of naturally occurring amino acid sequences. We analyzed and calculated the size of three abundant repeat protein families, whose members are large proteins made of many repetitions of conserved portions of ∼30 amino acids. While amino acid conservation at each position of the alignment explains most of the reduction of diversity relative to completely random sequences, we found that correlations between amino acid usage at different positions significantly impact that diversity. We quantified the impact of different types of correlations, functional and evolutionary, on sequence diversity. Analysis of the detailed structure of the coding space of the families revealed a rugged landscape, with many local energy minima of varying sizes arranged in a hierarchical structure, reminiscent of the frustrated energy landscapes of spin glasses in physics. This clustered structure indicates a multiplicity of subtypes within each family, and suggests new strategies for protein design.
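The idea of turning an alignment into a "size" can be illustrated with the simplest maximum entropy model, one that matches only the amino acid frequencies at each position and ignores correlations. The sketch below is not the authors' code: the function names and the toy alignment are hypothetical, and it assumes the effective number of sequences is taken as the exponential of the model entropy, with no pseudocounts or sequence reweighting.

```python
import math
from collections import Counter


def site_entropies(alignment):
    """Shannon entropy (in nats) of each column of an aligned set of sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        entropies.append(
            -sum((c / total) * math.log(c / total) for c in counts.values())
        )
    return entropies


def independent_site_size(alignment):
    """Effective number of sequences, exp(S), under an independent-site model.

    Assumption: positions are treated as independent, so the total entropy S
    is the sum of the per-column entropies.
    """
    return math.exp(sum(site_entropies(alignment)))


# Hypothetical toy alignment: columns 1 and 4 are fully conserved,
# columns 2 and 3 each use two residues with equal frequency.
toy_alignment = ["ACDA", "ACEA", "AGDA", "AGEA"]

size = independent_site_size(toy_alignment)
random_size = 20 ** len(toy_alignment[0])  # unconstrained sequences of the same length

print(f"independent-site size: {size:.1f}")
print(f"random-sequence size:  {random_size}")
```

For fixed single-site frequencies, the independent model has the largest possible entropy, so this number is an upper bound: any model that also reproduces correlations between positions, the effect the abstract emphasizes, can only give a smaller size.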

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6733475
DOI: http://dx.doi.org/10.1371/journal.pcbi.1007282

Similar Publications

Retinotopic biases in contextual feedback signals to V1 for object and scene processing.

Curr Res Neurobiol

June 2025

Centre for Cognitive Neuroimaging, School of Psychology and Neuroscience, College of Medical, Veterinary and Life Sciences, University of Glasgow, 62 Hillhead Street, Glasgow, G12 8QB, United Kingdom.

Identifying the objects embedded in natural scenes relies on recurrent processing between lower and higher visual areas. How is cortical feedback information related to objects and scenes organised in lower visual areas? The spatial organisation of cortical feedback converging in early visual cortex during object and scene processing could be retinotopically specific as it is coded in V1, or object centred as coded in higher areas, or both. Here, we characterise object and scene-related feedback information to V1.

Objectives: This data note presents a comprehensive geodatabase of cardiovascular disease (CVD) hospitalizations in Mashhad, Iran, alongside key environmental factors such as air pollutants, built environment indicators, green spaces, and urban density. Using a spatiotemporal dataset of over 52,000 hospitalized CVD patients collected over five years, the study supports approaches like advanced spatiotemporal modeling, artificial intelligence, and machine learning to predict high-risk CVD areas and guide public health interventions.

Data Description: This dataset includes detailed epidemiologic and geospatial information on CVD hospitalizations in Mashhad, Iran, from January 1, 2016, to December 31, 2020.

How are arbitrary sequences of verbal information retained and manipulated in working memory? Increasing evidence suggests that serial order in verbal WM is spatially coded and that spatial attention is involved in access and retrieval. Based on the idea that brain areas controlling spatial attention are also involved in oculomotor control, we used eye tracking to reveal how the spatial structure of serial order information is accessed in verbal working memory. In two experiments, participants memorized a sequence of auditory words in the correct order.

Biological memory networks are thought to store information by experience-dependent changes in the synaptic connectivity between assemblies of neurons. Recent models suggest that these assemblies contain both excitatory and inhibitory neurons (E/I assemblies), resulting in co-tuning and precise balance of excitation and inhibition. To understand computational consequences of E/I assemblies under biologically realistic constraints we built a spiking network model based on experimental data from telencephalic area Dp of adult zebrafish, a precisely balanced recurrent network homologous to piriform cortex.

Unlabelled: Neurophysiology studies propose that predictive coding is implemented via alpha/beta (8-30 Hz) rhythms that prepare specific pathways to process predicted inputs. This leads to a state of relative inhibition, reducing feedforward gamma (40-90 Hz) rhythms and spiking to predictable inputs. We refer to this model as predictive routing.
