Trends Biotechnol
March 2016
Building robust manufacturing processes from biological components is a highly complex task that requires sophisticated tools to describe processes, inputs, and measurements and to manage knowledge, data, and materials. We argue that for bioengineering to fully access biological potential, it will need to apply statistically designed experiments to derive detailed empirical models of the underlying systems. This requires large-scale structured experimentation, for which laboratory automation is necessary.
Recent work has led to a substantial improvement in the accuracy of predictions of contacts between amino acids using evolutionary information derived from multiple sequence alignments. Where large numbers of diverse sequence relatives are available and can be aligned to the sequence of a protein of unknown structure, it is now possible to generate high-resolution models without recourse to the structure of a template. In this review, we describe these exciting new techniques and critically assess the state of the art in contact prediction in light of them.
It has been known since the days when relatively few structures had been solved that longer protein chains often contain multiple domains, which may fold separately and act as reusable functional modules found in many contexts. In many structural biology tasks, particularly structure prediction, it is very useful to identify domains within the structure and analyze these regions separately. However, with sequence data alone this task has proven exceptionally difficult, with relatively little improvement over the naive method of choosing boundaries based on the size distributions of observed domains.
A general understanding of the complex phenomenon of protein evolution requires an accurate description of the constraints that define the sub-space of proteins with mutations that do not appreciably reduce the fitness of the organism. Such constraints can have multiple origins. In this work we present a model for constrained evolutionary trajectories, represented by a Markovian process through a set of protein-like structures artificially constructed to be topological intermediates between the structures of two naturally occurring proteins. The number and type of intermediate steps define how constrained the total evolutionary process is.
Residue contact predictions were calculated based on the mutual information observed between pairs of positions in large multiple protein sequence alignments. Where previously only the statistical properties of these data have been considered important, we introduce new measures to impose constraints that make the contact map more consistent with a three-dimensional structure. These included global (bulk) properties and local secondary structure properties.
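As a rough illustration of the quantity this abstract starts from, the sketch below computes the plain mutual information between two columns of a multiple sequence alignment. It is a minimal example under stated assumptions, not the paper's method: the function name `column_mutual_information` and the four-sequence toy alignment are hypothetical, and none of the structural constraints described in the abstract are applied.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j.

    `alignment` is a list of equal-length aligned sequences. Frequencies
    are raw counts; no sequence weighting or pseudocounts are applied
    (a simplifying assumption of this sketch).
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        ratio = p_ab * n * n / (counts_i[a] * counts_j[b])
        mi += p_ab * log2(ratio)
    return mi


# Hypothetical toy alignment: columns 0 and 3 covary perfectly,
# column 1 is fully conserved, column 2 varies independently of column 0.
toy_alignment = ["ACDA", "ACEA", "GCDG", "GCEG"]

covarying = column_mutual_information(toy_alignment, 0, 3)
independent = column_mutual_information(toy_alignment, 0, 2)
conserved = column_mutual_information(toy_alignment, 0, 1)
```

In this toy case the covarying pair scores one bit, while the independent and conserved pairs score zero, which is why raw mutual information alone says nothing about fully conserved positions.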
Residue contacts predicted from correlated positions in a multiple sequence alignment are often sparse and uncertain. To some extent, these limitations in the data can be overcome by grouping the contacts by secondary structure elements and enumerating the possible packing arrangements of these elements in a combinatorial manner. Strong interactions appear frequently but inconsistent interactions are down-weighted and missing interactions up-weighted.
The extraction of correlated mutations through the method of direct information (DI) provides predicted contact residue pairs that can be used to constrain the three-dimensional structures of proteins. We apply this method to a large set of decoy protein folds consisting of many thousands of well-constructed models, only tens of which have the correct fold. We find that DI greatly improves the ranking of the true (native) fold, but others remain high scoring and would be difficult to discard due to small shifts in the core beta sheets.
Complex I is the first and largest enzyme in the respiratory chain and is located in the inner mitochondrial membrane. Complex I deficiency is the most commonly reported mitochondrial disorder presenting in childhood, but the molecular basis of most cases remains elusive. We describe a patient with complex I deficiency caused by mutation of the molecular chaperone FOXRED1.
It is well known that the set of observed topological arrangements of secondary structures in globular proteins is highly limited. These limitations have been explained as the consequence of several rules of thumb, including a strong preference for right-handed connections and against crossing loops and certain beta strand patterns. We present a critical evaluation of the power of these rules to distinguish known from possible topologies in a large set of two- and three-layer protein structures and determine that although these rules are still largely valid, an increasing number of exceptions can be found to many of them.
There has been considerable progress towards the goal of understanding the space of possible tertiary structures adopted by proteins. Despite a greatly increased rate of structure determination and a deliberate strategy of sequencing proteins expected to be very different from those already known, it is now rare to see a genuinely new fold, leading to the conclusion that we have seen the majority of natural structural types. The increase in knowledge has also led to a critical examination of traditional fold-based classifications and their meaning for evolution and protein structures.
In recent years, there have been significant advances in the field of computational protein design including the successful computational design of enzymes based on backbone scaffolds from experimentally solved structures. It is likely that large-scale sampling of protein backbone conformations will become necessary as further progress is made on more complicated systems. Removing the constraint of having to use scaffolds based on known protein backbones is a potential method of solving the problem.
Motivation: Generation of structural models and recognition of homologous relationships for unannotated protein sequences are fundamental problems in bioinformatics. Improving the sensitivity and selectivity of methods designed for these two tasks therefore has downstream benefits for many other bioinformatics applications.
Results: We describe the latest implementation of the GenTHREADER method for structure prediction on a genomic scale.
Coenzyme Q(10) is a mobile lipophilic electron carrier located in the inner mitochondrial membrane. Defects of coenzyme Q(10) biosynthesis represent one of the few treatable mitochondrial diseases. We genotyped a patient with primary coenzyme Q(10) deficiency who presented with neonatal lactic acidosis and later developed multisystem disease including intractable seizures, global developmental delay, hypertrophic cardiomyopathy, and renal tubular dysfunction.
BMC Struct Biol
March 2009
Background: A great deal is known about the qualitative aspects of the sequence-structure relationship, for example that buried residues are usually more conserved between structurally similar homologues, but no attempts have been made to quantitate the relationship between evolutionary conservation at a sequence position and change to global tertiary structure. In this paper we demonstrate that the Spearman correlation between sequence and structural change is suitable for this purpose.
Results: Buried residues, bends, cysteines, prolines and leucines were significantly more likely to occupy positions highly correlated with structural change than expected by chance.
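For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of Spearman rank correlation: the Pearson correlation of the two variables after each is replaced by its ranks, with tied values sharing an average rank. The function names and the example numbers are hypothetical, and the paper's actual measures of sequence change and structural change are not reproduced here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for one alignment position across four homologue
# pairs: a sequence-change score and a global structural-change score.
sequence_change = [0.0, 1.0, 2.0, 5.0]
structural_change = [0.3, 0.9, 1.4, 6.2]
rho = spearman(sequence_change, structural_change)
```

Because only ranks enter the calculation, any strictly increasing relationship gives a coefficient of 1 regardless of its shape, which suits quantities measured on unrelated scales.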
Alternative premessenger RNA splicing enables genes to generate more than one gene product. Splicing events that occur within protein coding regions have the potential to alter the biological function of the expressed protein and even to create new protein functions. Alternative splicing has been suggested as one explanation for the discrepancy between the number of human genes and functional complexity.
The completion of whole genome sequencing projects has provided the genetic instructions of life. However, whereas the identification of gene coding regions has progressed, the mapping of transcriptional regulatory motifs has moved more slowly. To understand how distinct expression profiles can be established and maintained, a greater understanding of these sequences and their trans-acting factors is required.
Motivation: Previous work had established that it was possible to derive sparse signatures (essentially sequence-length motifs) by examining points of contact between residues in proteins of known three-dimensional (3D) structure. Many interesting protein families have very little tertiary structural information. Methods for deriving signatures using only primary and secondary-structural information were therefore developed.