Determining the molecular function of enzymes discovered by genome sequencing represents a primary foundation for understanding many aspects of biology. Historically, classification of enzyme reactions has used the enzyme nomenclature system developed to describe the overall reactions performed by biochemically characterized enzymes, irrespective of their associated sequences. In contrast, functional classification and assignment for the millions of protein sequences of unknown function now available is largely done in two computational steps, first by similarity-based assignment of newly obtained sequences to homologous groups, followed by transferring to them the known functions of similar biochemically characterized homologs.
A 4-oxalocrotonate tautomerase (4-OT) trimer has been isolated from Burkholderia lata, and a kinetic, mechanistic, and structural analysis has been performed. The enzyme is the third described oligomer state for 4-OT along with a homo- and heterohexamer. The 4-OT trimer is part of a small subset of sequences (133 sequences) within the 4-OT subgroup of the tautomerase superfamily (TSF).
The radical SAM superfamily contains over 100,000 homologous enzymes that catalyze a remarkably broad range of reactions required for life, including metabolism, nucleic acid modification, and biogenesis of cofactors. While the highly conserved SAM-binding motif responsible for formation of the key 5'-deoxyadenosyl radical intermediate is a key structural feature that simplifies identification of superfamily members, our understanding of their structure-function relationships is complicated by the modular nature of their structures, which exhibit varied and complex domain architectures. To gain new insight about these relationships, we classified the entire set of sequences into similarity-based subgroups that could be visualized using sequence similarity networks.
With ever-increasing amounts of sequence data available in both the primary literature and sequence repositories, there is a bottleneck in annotating molecular function to a sequence. This article describes the biocuration process and methods used in the structure-function linkage database (SFLD) to help address some of the challenges. We discuss how the hierarchy within the SFLD allows us to infer detailed functional properties for functionally diverse enzyme superfamilies in which all members are homologous, conserve an aspect of their chemical function and have associated conserved structural features that enable the chemistry.
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: the amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants.
Marchantia polymorpha is a basal terrestrial land plant, which like most liverworts accumulates structurally diverse terpenes believed to serve in deterring disease and herbivory. Previous studies have suggested that the mevalonate and methylerythritol phosphate pathways, present in evolutionarily diverged plants, are also operative in liverworts. However, the genes and enzymes responsible for the chemical diversity of terpenes have yet to be described.
Enzyme function prediction remains an important open problem. Though structure-based modeling, such as metabolite docking, can identify substrates of some enzymes, it is ill-suited to reactions that progress through a covalent intermediate. Here we investigated the ability of covalent docking to identify substrates that pass through such a covalent intermediate, focusing particularly on the haloalkanoate dehalogenase superfamily.
Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily.
The rapid advance in genome sequencing presents substantial challenges for protein functional assignment, with half or more of new protein sequences inferred from these genomes having uncertain assignments. The assignment of enzyme function in functionally diverse superfamilies represents a particular challenge, which we address through a combination of computational predictions, enzymology, and structural biology. Here we describe the results of a focused investigation of a group of enzymes in the enolase superfamily that are involved in epimerizing dipeptides.
As increasingly large amounts of data from genome and other sequencing projects become available, new approaches are needed to determine the functions of the proteins these genes encode. We show how large-scale computational analysis can help to address this challenge by linking functional information to sequence and structural similarities using protein similarity networks. Network analyses using three functionally diverse enzyme superfamilies illustrate the use of these approaches for facile updating and comparison of available structures for a large superfamily, for creation of functional hypotheses for metagenomic sequences, and to summarize the limits of our functional knowledge about even well studied superfamilies.
Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation levels for molecular function in four public protein sequence databases (UniProtKB/Swiss-Prot, GenBank NR, UniProtKB/TrEMBL, and KEGG) for a model set of 37 enzyme families for which extensive experimental information is available.
To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members.
A kinetic study of the reversible deprotonation of methylnitroacetate (4H) by primary aliphatic amines, secondary alicyclic amines, hydroxide ion, and water in water at 25 °C and in 50% DMSO/50% water (v/v) at 20 °C is reported. Intrinsic rate constants, k0, have been determined by extrapolation or interpolation of Brønsted plots. In comparison to proton transfers involving other nitroalkanes, the intrinsic rate constants for 4H are exceptionally high; for example, log k0 for the reaction of 4H with secondary alicyclic amines in water (1.
A spectroscopic and kinetic study of the reaction of methyl beta-methylthio-alpha-nitrocinnamate (4-SMe) with morpholine, piperidine, and hydroxide ion in 50% DMSO/50% water (v/v) at 20 °C is reported. The reactions of 4-SMe with piperidine in a pH range from 10.12 to 11.
The rates of hydrolysis of alpha-R-alpha-(methylthio)methylene Meldrum's acids (8-R with R = H, Me, Et, s-Bu, and t-Bu) were determined in basic and acidic solution in 50% DMSO-50% water (v/v) at 20 °C. In basic solution (KOH), nucleophilic attack to form a tetrahedral intermediate (T(OH)-) is rate limiting for all substrates (k1(OH)). In acidic solution (HCl) and at intermediate pH values (acetate buffers), water attack (k1(H2O)) is rate limiting for 8-Me, 8-Et, and 8-s-Bu; the same is presumably the case for 8-t-Bu, but rates were too slow for accurate measurements at low pH.
Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures.
The study of mechanistically diverse enzyme superfamilies (collections of enzymes that perform different overall reactions but share both a common fold and a distinct mechanistic step performed by key conserved residues) helps elucidate the structure-function relationships of enzymes. We have developed a resource, the structure-function linkage database (SFLD), to analyze these structure-function relationships. Unique to the SFLD is its hierarchical classification scheme based on linking the specific partial reactions (or other chemical capabilities) that are conserved at the superfamily, subgroup, and family levels with the conserved structural elements that mediate them.
A kinetic study of the acid-catalyzed loss of alkoxide and thiolate ions from alkoxide and thiolate ion adducts, respectively, of benzylidene Meldrum's acid (1-H), methoxybenzylidene Meldrum's acid (1-OMe), and thiomethoxybenzylidene Meldrum's acid (1-SMe) is reported. The reactions appear to be subject to general acid catalysis, although the catalytic effect of buffers is weak and the bulk of the reported data refers to H(+)-catalysis. alpha-Carbon protonation and, in some cases, protonation of one of the carbonyl oxygens to form an enol compete with alkoxide or thiolate ion expulsion.