Protein space is characterized by extensive recurrence, or "reuse," of parts, suggesting that new proteins and domains can evolve by mixing-and-matching of existing segments. From an evolutionary perspective, for a given combination to persist, the protein segments should presumably not only match geometrically but also dynamically communicate with each other to allow concerted motions that are key to function. Evidence from protein space supports the premise that domains indeed combine in this manner; we explore whether a similar phenomenon can be observed at the sub-domain level.
Amyloids, protein, and peptide assemblies in various organisms are crucial in physiological and pathological processes. Their intricate structures, however, present significant challenges, limiting our understanding of their functions, regulatory mechanisms, and potential applications in biomedicine and technology. This study evaluated the AlphaFold2 ColabFold method's structure predictions for antimicrobial amyloids, using eight antimicrobial peptides (AMPs), including those with experimentally determined structures and AMPs known for their distinct amyloidogenic morphological features.
The emergence of novel proteins, beyond those that can be readily made by duplication and recombination of preexisting domains, is elusive. De novo emergence from random sequences is unlikely because the vast majority of random chains would not even fold, let alone function. An alternative explanation is that novel proteins emerge by duplication and fusion of preexisting polypeptide segments.
Accurate protein structure predictors use clusters of homologues, which disregard sequence-specific effects. In this issue of Structure, Weißenow and colleagues report a deep learning-based tool, EMBER2, that efficiently predicts the distances in a protein structure from its amino acid sequence only. This approach should enable the analysis of mutation effects.
PLoS Comput Biol
February 2022
As sequence and structure comparison algorithms gain sensitivity, the intrinsic interconnectedness of the protein universe has become increasingly apparent. Despite this general trend, β-trefoils have emerged as an uncommon counterexample: They are an isolated protein lineage for which few, if any, sequence or structure associations to other lineages have been identified. If β-trefoils are, in fact, remote islands in sequence-structure space, it implies that the oligomerizing peptide that founded the β-trefoil lineage itself arose de novo.
Understanding how proteins evolved not only resolves mysteries of the past, but also helps address challenges of the future, particularly those relating to the design and engineering of new protein functions. Here we review the work of Dan S. Tawfik, one of the pioneers of this area, highlighting his seminal contributions in diverse fields such as protein design, high-throughput screening, protein stability, fundamental enzyme-catalyzed reactions and promiscuity, that underpin biology and the origins of life.
Proc Natl Acad Sci U S A
August 2021
Outer-membrane beta barrels (OMBBs) are found in the outer membrane of gram-negative bacteria and eukaryotic organelles. OMBBs fold as antiparallel β-sheets that close onto themselves, forming pores that traverse the membrane. Currently known structures include only one barrel, of 8 to 36 strands, per chain.
The deep learning revolution introduced a new and efficacious way to address computational challenges in a wide range of fields, relying on large data sets and powerful computational resources. In protein engineering, we consider the challenge of computationally predicting properties of a protein and designing sequences with these properties. Indeed, accurate and fast deep network oracles for different properties of proteins have been developed.
The vast majority of theoretically possible polypeptide chains do not fold, let alone confer function. Hence, protein evolution from preexisting building blocks has clear potential advantages over ab initio emergence from random sequences. In support of this view, sequence similarities between different proteins are generally indicative of common ancestry, and we collectively refer to such homologous sequences as "themes."
Evolutionary processes that formed the current protein universe left their traces, among them homologous segments that recur, or are 'reused,' in multiple proteins. These reused segments, called 'themes,' can be found at various scales, the best known of which is the domain. Yet, recent studies have begun to focus on the evolutionary insights that can be derived from sub-domain-scale themes, which are candidates for traces of more ancient events.
This article is dedicated to the memory of Michael G. Rossmann. Dating back to the last universal common ancestor, P-loop NTPases and Rossmanns comprise the most ubiquitous and diverse enzyme lineages.
Classical congenital adrenal hyperplasia (CAH) caused by pathogenic variants in the steroid 21-hydroxylase gene is a severe life-threatening condition. We present a detailed investigation of the molecular and functional characteristics of a novel pathogenic variant in this gene. The patient, a 46,XX newborn, was diagnosed with classical salt-wasting CAH in the neonatal period after initially presenting with ambiguous genitalia.
Background: Coronavirus disease 2019 (COVID-19) and dengue fever are difficult to distinguish given shared clinical and laboratory features. Failing to consider COVID-19 due to false-positive dengue serology can have serious implications. We aimed to assess this possible cross-reactivity.
Proteins' interactions with ancient ligands may reveal how molecular recognition emerged and evolved. We explore how proteins recognize adenine: a planar rigid fragment found in the most common and ancient ligands. We have developed a computational pipeline that extracts protein-adenine complexes from the Protein Data Bank, structurally superimposes their adenine fragments, and detects the hydrogen bonds mediating the interaction.
Outer membrane proteins (OMPs) are the proteins in the surface of Gram-negative bacteria. These proteins have diverse functions but a single topology: the β-barrel. Sequence analysis has suggested that this common fold is a β-hairpin repeat protein, and that amplification of the β-hairpin has resulted in 8-26-stranded barrels.
Present-day protein space is the result of 3.7 billion years of evolution, constrained by the underlying physicochemical qualities of the proteins. It is difficult to differentiate between evolutionary traces and effects of physicochemical constraints.
There are around 100 varieties of outer membrane proteins in each Gram-negative bacterium. All of these proteins have the same fold: an up-down β-barrel. It has been suggested that all membrane β-barrels excluding lysins are homologous.
The protein interface is key to understanding protein function, providing vital insight into how proteins interact with each other and with other molecules. Over the years, many computational methods to compare protein structures have been developed, yet evaluating interface similarity remains a very difficult task. Here, we present PatchBag, a geometry-based method for efficient comparison of protein surfaces and interfaces.
Proc Natl Acad Sci U S A
October 2017
Proteins share similar segments with one another. Such "reused parts," which have been successfully incorporated into other proteins, are likely to offer an evolutionary advantage over de novo evolved segments, as most of the latter will not even have the capacity to fold. To systematically explore the evolutionary traces of segment "reuse" across proteins, we developed an automated methodology that identifies reused segments from protein alignments.
Protein function involves conformational changes, but often, for a given protein, only some of these conformations are known. The missing conformations could be predicted using the wealth of data in the PDB. Most PDB proteins have multiple structures, and proteins sharing one similar conformation often share others as well.
It can be informative to view biological data, e.g., protein-protein interactions within a large complex, in a network representation coupled with three-dimensional structural visualizations of individual molecular entities.
To explore protein space from a global perspective, we consider 9,710 SCOP (Structural Classification of Proteins) domains with up to 70% sequence identity and present all similarities among them as networks: In the "domain network," nodes represent domains, and edges connect domains that share "motifs," i.e., significantly sized segments of similar sequence and structure.
Motivation: Structural knowledge, extracted from the Protein Data Bank (PDB), underlies numerous potential functions and prediction methods. The PDB, however, is highly biased: many proteins have more than one entry, while entire protein families are represented by a single structure, or even not at all. The standard solution to this problem is to limit the studies to non-redundant subsets of the PDB.
In the fifty years since the first atomic structure of a protein was revealed, tens of thousands of additional structures have been solved. Like all objects in biology, protein structures show common patterns that seem to define family relationships. Classification of protein structures, which started in the 1970s with about a dozen structures, has continued with increasing enthusiasm, leading to two main fold classifications, SCOP and CATH, as well as many additional databases.