Exploiting sequence-structure-function relationships in biotechnology requires improved methods for aligning proteins that have low sequence similarity to previously annotated proteins. We develop two deep learning methods to address this gap, TM-Vec and DeepBLAST. TM-Vec allows searching for structure-structure similarities in large sequence databases.
The ability of Mycobacterium tuberculosis (Mtb) to persist in its host may enable an evolutionary advantage for drug resistant variants to emerge. A potential strategy to prevent persistence and gain drug efficacy is to directly target the activity of enzymes that are crucial for persistence. We present a method for expedited discovery and structure-based design of lead compounds by targeting the hypoxia-associated enzyme L-alanine dehydrogenase (AlaDH).
We report a critical advance in the generation and characterization of peptoid hetero-oligomers. A library of sub-monomers with amine and carboxylate side-chains is combined in different sequences using microwave-assisted synthesis. Their sequence-structure propensity is confirmed by circular dichroism, and conformer subtypes are enumerated by NMR.
While natural protein-protein interactions have evolved to be induced by complex stimuli, rational design of interactions that can be switched on demand still remains challenging in the protein design world. Here, we demonstrate that a natural interface computationally redesigned for improved binding affinity could be further mutated to adopt a pH-switchable interaction. The redesigned interface of Protein G/human IgG Fc domain (referred to as PrG/hIgG), when incorporated with histidine and glutamic acid on PrG (PrG-EHHE), showed a 50-fold switch in binding affinity when the pH was altered from mildly acidic to mildly basic.
Enzyme engineering for gain of function requires navigating a large combinatorial sequence space efficiently. Typically, many mutations are needed to get significant improvements, while a single "bad" mutation can inactivate the enzyme. To establish high-throughput screening and achieve enhanced resolution between two variants, genetic libraries of the organophosphate hydrolase enzyme paraoxonase 1 (PON1) were rapidly screened using an engineered positive-feedback circuit: a p-nitrophenol (PNP)-specific transcription factor (TF) regulated expression of PON1, which catalyzed paraoxon breakdown and PNP production.
Many scientific disciplines rely on computational methods for data analysis, model generation, and prediction. Implementing these methods is often accomplished by researchers with domain expertise but without formal training in software engineering or computer science. This arrangement has led to underappreciation of sustainability and maintainability of scientific software tools developed in academic environments.
Product inhibition is a frequent bottleneck in industrial enzymes, and testing mutations to alleviate product inhibition via traditional methods remains challenging as many variants need to be tested against multiple substrate and product concentrations. Further, traditional screening methods are conducted in vitro, and resulting enzyme variants may perform differently in vivo in the context of whole-cell metabolism and regulation. In this study, we address these two problems by establishing a high-throughput screening method to alleviate product inhibition in an industrially relevant enzyme, chorismate pyruvate-lyase (UbiC).
Robust fluorescence-based biosensors are emerging as critical tools for high-throughput strain improvement in synthetic biology. Many biosensors are developed in model organisms where sophisticated synthetic biology tools are also well established. However, industrial biochemical production often employs microbes with phenotypes that are advantageous for a target process, and biosensors may fail to directly transition outside the host in which they are developed.
ACS Synth Biol
January 2017
Thermostabilization of an enzyme with complete retention of catalytic efficiency was demonstrated on recombinant 3-dehydroshikimate dehydratase (DHSase or wtAsbF) from Bacillus thuringiensis serovar konkukian 97-27 (hereafter, B. thuringiensis 97-27). The wtAsbF is relatively unstable in vitro at 37 °C (t1/2 = 15 min) in the absence of divalent metal.
A whole-cell biosensor utilizing a transcription factor (TF) is an effective tool for sensitive and selective detection of specialty chemicals or anthropogenic molecules, but requires access to an expanded repertoire of TFs. Using homology modeling and ligand docking for binding pocket identification, assisted by conservative mutations in the pocket, we engineered a novel specificity in an Acinetobacter TF, PobR, to 'sense' a chemical p-nitrophenol (pNP) and measured the response via a fluorescent protein reporter expressed from a PobR promoter. Out of 10^7 variants of PobR, four were active when dosed with pNP, with two mutants showing a specificity switch from the native effector 4-hydroxybenzoate (4HB).
Structure-based rational mutagenesis for engineering protein functionality has been limited by the scarcity and difficulty of obtaining crystal structures of desired proteins. On the other hand, when high-throughput selection is possible, directed evolution-based approaches for gaining protein functionalities have been random and fortuitous with limited rationalization. We combine comparative modeling of dimer structures, ab initio loop reconstruction, and ligand docking to select positions for mutagenesis to create a library focused on the ligand-contacting residues.
We created a single-cell sorting system to screen for enzyme activity in Escherichia coli producing 3,4-dihydroxybenzoate (34DHB). To do so, we engineered a transcription factor regulon controlling the expression of green fluorescent protein (GFP) for induction by 34DHB. An autoregulated transcription factor, pcaU, was borrowed from Acinetobacter sp. ADP1 to E.
Protein G is an IgG binding protein that has been widely exploited for biotechnological purposes. Rosetta protein modeling identified a set of favorable polar mutations in Protein G, at its binding interface with the Fc domain of Immunoglobulin G, that were predicted to increase the stability and tighten the binding relative to native Protein G, with only a minor perturbation of the binding mode seen in the crystal structure. This triple mutant was synthesized and evaluated experimentally.
The p21-activated kinase (PAK) family plays a versatile role in cell signaling by forming a hub of interactions. PAKs bind GTPases such as RAC and CDC42. Their proline-rich motifs bind SH3 adaptor proteins such as PIX and NCK.
Understanding how binding of a transcription factor to an operator is influenced by the operator sequence is an ongoing quest. It facilitates discovery of alternative binding sites as well as tuning of transcriptional regulation. We investigated the behavior of the Escherichia coli Lac repressor (LacI) protein with a large set of lac O1 operator variants.
The Rosetta de novo structure prediction and loop modeling protocols begin with coarse-grained Monte Carlo searches in which the moves are based on short fragments extracted from a database of known structures. Here we describe a new object-oriented program for picking fragments that greatly extends the functionality of the previous program (nnmake) and opens the door for new approaches to structure modeling. We provide a detailed description of the code design and architecture, highlighting its modularity, and new features such as extensibility, total control over the fragment picking workflow and scoring system customization.
Proc Natl Acad Sci U S A
October 2008
Naturally occurring homooligomeric protein complexes exhibit striking internal symmetry. The evolutionary origins of this symmetry have been the subject of considerable speculation; proposals for the advantages associated with symmetry include greater folding efficiency, reduced aggregation, amenability to allosteric regulation, and greater adaptability. An alternative possibility stems from the idea that to contribute to fitness, and hence be subject to evolutionary optimization, a complex must be significantly populated, which implies that the interaction energy between monomers in the ancestors of modern-day complexes must have been sufficient to at least partially overcome the entropic cost of association.
J Bioinform Comput Biol
February 2008
Cryoelectron microscopy (cryoEM) is an experimental technique to determine the three-dimensional (3D) structure of large protein complexes. Currently, this technique is able to generate protein density maps at 6-9 Å resolution, at which the skeleton of the structure (which is composed of alpha-helices and beta-sheets) can be visualized. As a step towards predicting the entire backbone of the protein from the protein density map, we developed a method to predict the topology and sequence alignment for the skeleton helices.
Saccharomyces cerevisiae is one of the best-studied model organisms, yet the three-dimensional structure and molecular function of many yeast proteins remain unknown. Yeast proteins were parsed into 14,934 domains, and those lacking sequence similarity to proteins of known structure were folded using the Rosetta de novo structure prediction method on the World Community Grid. This structural data was integrated with process, component, and function annotations from the Saccharomyces Genome Database to assign yeast protein domains to SCOP superfamilies using a simple Bayesian approach.
Many applications require a method for translating a large list of bond angles and bond lengths to precise atomic Cartesian coordinates. This simple but computationally expensive task occurs ubiquitously in modeling proteins, DNA, and other polymers as well as in many other fields such as robotics. To find an optimal method, algorithms can be compared by the number of operations, speed, intrinsic numerical stability, and parallelization.
A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem.
The U.S. Department of Energy recently announced the first five grants for the Genomes to Life (GTL) Program.
Advances in structural genomics and protein structure prediction require the design of automatic, fast, objective, and well-benchmarked methods capable of comparing and assessing the similarity of low-resolution three-dimensional structures obtained by experimental or theoretical approaches. Here, a new method for sequence-independent structural alignment is presented that allows comparison of an experimental protein structure with an arbitrary low-resolution protein tertiary model. The heuristic algorithm is given and then used to show that it can describe random structural alignments of proteins with different folds with good accuracy by an extreme value distribution.
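To illustrate how an extreme value distribution can turn a raw alignment score into a significance estimate, here is a minimal sketch. It is not the method from this abstract: it assumes a Gumbel (type I extreme value) form, fits it to background scores by the method of moments, and uses hypothetical function names (`fit_gumbel_moments`, `gumbel_p_value`). The background scores would come from alignments between unrelated folds; none are supplied here.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Estimate Gumbel location and scale from background scores using the
    method of moments: sd = pi * scale / sqrt(6), mean = location + gamma * scale."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    scale = sd * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def gumbel_p_value(score, location, scale):
    """Probability that a random alignment scores at least `score`:
    P = 1 - exp(-exp(-(score - location) / scale))."""
    z = (score - location) / scale
    if z < -700.0:
        # exp(-z) would overflow; the probability is 1 to machine precision.
        return 1.0
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-math.exp(-z))
```

A score far above the fitted location yields a small probability, which is the sense in which a structural match can be called unlikely to arise from unrelated folds.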