An unresolved question in the origin and evolution of life is whether a continuous path from geochemical precursors to the majority of molecules in the biosphere can be reconstructed from modern-day biochemistry. Here we identified a feasible path by simulating the evolution of biosphere-scale metabolism, using only known biochemical reactions and models of primitive coenzymes. We find that purine synthesis constitutes a bottleneck for metabolic expansion, which can be alleviated by non-autocatalytic phosphoryl coupling agents.
Nat/Ivy is a diverse and ubiquitous CoA-binding evolutionary lineage that catalyzes acyltransferase reactions, primarily converting thioesters into amides. At the heart of the Nat/Ivy fold is a phosphate-binding loop that bears a striking resemblance to that of P-loop NTPases: both are extended, glycine-rich loops situated between a β-strand and an α-helix. Nat/Ivy, therefore, represents an intriguing intersection between thioester chemistry, a putative primitive energy currency, and an ancient mode of phospho-ligand binding.
Anthropogenic organophosphorus compounds (AOPCs), such as phosphotriesters, are used extensively as plasticizers, flame retardants, nerve agents, and pesticides. To date, only a handful of soil bacteria bearing a phosphotriesterase (PTE), the key enzyme in the AOPC degradation pathway, have been identified. Therefore, the extent to which bacteria are capable of utilizing AOPCs as a phosphorus source, and how widespread this adaptation may be, remains unclear.
Peptide-RNA coacervates can result in the concentration and compartmentalization of simple biopolymers. Given their primordial relevance, peptide-RNA coacervates may have also been a key site of early protein evolution. However, the extent to which such coacervates might promote or suppress the exploration of novel peptide conformations is fundamentally unknown.
PLoS Comput Biol
February 2022
As sequence and structure comparison algorithms gain sensitivity, the intrinsic interconnectedness of the protein universe has become increasingly apparent. Despite this general trend, β-trefoils have emerged as an uncommon counterexample: They are an isolated protein lineage for which few, if any, sequence or structure associations to other lineages have been identified. If β-trefoils are, in fact, remote islands in sequence-structure space, it implies that the oligomerizing peptide that founded the β-trefoil lineage itself arose de novo.
Among the enzyme lineages that undoubtedly emerged prior to the last universal common ancestor is the so-called HUP, which includes Class I aminoacyl tRNA synthetases (AARSs) as well as enzymes mediating NAD, FAD, and CoA biosynthesis. Here, we provide a detailed analysis of HUP evolution, from emergence to structural and functional diversification. The HUP is a nucleotide binding domain that uniquely catalyzes adenylation via the release of pyrophosphate.
Proc Natl Acad Sci U S A
April 2021
The P-loop Walker A motif underlies hundreds of essential enzyme families that bind nucleotide triphosphates (NTPs) and mediate phosphoryl transfer (P-loop NTPases), including the earliest DNA/RNA helicases, translocases, and recombinases. What were the primordial precursors of these enzymes? Could these large and complex proteins emerge from simple polypeptides? Previously, we showed that P-loops embedded in simple βα repeat proteins bind NTPs but also, unexpectedly so, ssDNA and RNA. Here, we extend beyond the purely biophysical function of ligand binding to demonstrate rudimentary helicase-like activities.
This article is dedicated to the memory of Michael G. Rossmann. Dating back to the last universal common ancestor, P-loop NTPases and Rossmanns comprise the most ubiquitous and diverse enzyme lineages.
Polyamines are known to mediate diverse biological processes, and specifically to bind and stabilize compact conformations of nucleic acids, acting as chemical chaperones that promote folding by offsetting the repulsive negative charges of the phosphodiester backbone. However, whether and how polyamines modulate the structure and function of proteins remain unclear. In particular, early proteins are thought to have been highly acidic, like nucleic acids, due to a scarcity of basic amino acids in the prebiotic context.
De novo emergence demands a transition from disordered polypeptides into structured proteins with well-defined functions. However, can polypeptides confer functions of evolutionary relevance, and how might such polypeptides evolve into modern proteins? The earliest proteins present an even greater challenge, as they were likely based on abiotic, spontaneously synthesized amino acids. Here we asked whether a primordial function, such as nucleic acid binding, could emerge with ornithine, a basic amino acid that forms abiotically yet is absent in modern-day proteins.
Many protein architectures exhibit evidence of internal rotational symmetry, postulated to be the result of gene duplication/fusion events involving a primordial polypeptide motif. A common feature of such structures is a domain-swapped arrangement at the interface of the N- and C-terminal motifs, which is postulated to provide cooperative interactions that promote folding and stability. De novo designed symmetric protein architectures have demonstrated an ability to accommodate circular permutation of the N- and C-termini in the overall architecture; however, the folding requirement of the primordial motif is poorly understood, and tolerance to circular permutation is essentially unknown.
The ubiquity of phospho-ligands suggests that phosphate binding emerged at the earliest stage of protein evolution. To evaluate this hypothesis and unravel its details, we identified all phosphate-binding protein lineages in the Evolutionary Classification of Protein Domains database. We found at least 250 independent evolutionary lineages that bind small molecule cofactors and metabolites with phosphate moieties.
Enzymes catalyze a vast range of reactions. Their catalytic performances, mechanisms, global folds, and active-site architectures are also highly diverse, suggesting that enzymes are shaped by an entire range of physiological demands and evolutionary constraints, as well as by chemical and physicochemical constraints. We have attempted to identify signatures of these shaping demands and constraints.
An efficient protein-folding pathway leading to target structure, and the avoidance of aggregation, is essential to protein evolution and de novo design; however, design details to achieve efficient folding and avoid aggregation are poorly understood. We report characterization of the thermally-induced aggregate of fibroblast growth factor-1 (FGF-1), a small globular protein, by solid-state NMR. NMR spectra are consistent with residual structure in the aggregate and provide evidence of a structured region that corresponds to the region of the folding nucleus.
The linkage between regulatory elements of transcription, such as promoters, and their protein products is central to gene function. Promoter-protein coevolution is therefore expected, but rarely observed, and the manner by which these two regulatory levels are linked remains largely unknown. We study glutamate dehydrogenase, a hub of carbon and nitrogen metabolism.
The folding nucleus (FN) is a cryptic element within protein primary structure that enables an efficient folding pathway and is the postulated heritable element in the evolution of protein architecture; however, almost nothing is known regarding how the FN structurally changes as complex protein architecture evolves from simpler peptide motifs. We report characterization of the FN of a designed purely symmetric β-trefoil protein by ϕ-value analysis. We compare the structure and folding properties of key foldable intermediates along the evolutionary trajectory of the β-trefoil.
Type I collagen is composed of two polypeptides, α1(I) and α2(I), which fold into a triple helix. Collagen α1(I) and α2(I) mRNAs have a conserved stem-loop structure in their 5' UTRs, the 5'SL. LARP6 binds the 5'SL to regulate type I collagen expression.
J Pharm Sci
February 2015
Buried free-cysteine (Cys) residues can contribute to an irreversible unfolding pathway that promotes protein aggregation, increases immunogenic potential, and significantly reduces protein functional half-life. Consequently, mutation of buried free-Cys residues can result in significant improvement in the storage, reconstitution, and pharmacokinetic properties of protein-based therapeutics. Mutational design to eliminate buried free-Cys residues typically follows one of two common heuristics: either substitution by Ser (polar and isosteric), or substitution by Ala or Val (hydrophobic); however, a detailed structural and thermodynamic understanding of Cys mutations is lacking.
The halophile environment has a number of compelling aspects with regard to the origin of structured polypeptides (i.e., proteogenesis) and, instead of a curious niche that living systems adapted into, the halophile environment is emerging as a candidate "cradle" for proteogenesis.
Models of symmetric protein evolution typically invoke gene duplication and fusion events, in which repetition of a structural motif generates foldable, stable symmetric protein architecture. Success of such evolutionary processes suggests that the duplicated structural motif must be capable of nucleating protein folding. If correct, symmetric expansion of a folding nucleus sequence derived from an extant symmetric fold may be an elegant and computationally tractable solution to de novo protein design.
Top-down symmetric deconstruction (TDSD) is a joint experimental and computational approach to generate a highly stable, functionally benign protein scaffold for intended application in subsequent functional design studies. By focusing on symmetric protein folds, TDSD can leverage the dramatic reduction in sequence space achieved by applying a primary structure symmetric constraint to the design process. Fundamentally, TDSD is an iterative symmetrization process, in which the goal is to maintain or improve properties of thermodynamic stability and folding cooperativity inherent to a starting sequence (the "proxy").