We propose a random partition distribution indexed by pairwise similarity information such that partitions compatible with the similarities are given more probability. The use of pairwise similarities, in the form of distances, is common in some clustering algorithms (e.g.
The methylotrophic yeast Pichia pastoris has been used extensively for expressing recombinant proteins because it combines the ease of genetic manipulation, the ability to provide complex posttranslational modifications and the capacity for efficient protein secretion. The most successful and commonly used secretion signal leader in Pichia pastoris has been the alpha mating factor (MATα) prepro secretion signal. However, limitations exist as some proteins cannot be secreted efficiently, leading to strategies to enhance secretion efficiency by modifying the secretion signal leader.
Motivation: By simplifying the many-bodied complexity of residue packing into patterns of simple pairwise secondary structure interactions between a single knob residue with a three-residue socket, the knob-socket construct allows a more direct incorporation of structural information into the prediction of residue contacts. By modeling the preferences between the amino acid composition of a socket and knob, we undertake an investigation of the knob-socket construct's ability to improve the prediction of residue contacts. The statistical model considers three priors and two posterior estimations to better understand how the input data affects predictions.
Determining the primary structure (i.e., amino acid sequence) of a protein has become cheaper, faster, and more accurate.
The methylotrophic yeast, Pichia pastoris, has been genetically engineered to produce many heterologous proteins for industrial and research purposes. In order to secrete proteins for easier purification from the extracellular medium, the coding sequence of recombinant proteins is initially fused to the Saccharomyces cerevisiae α-mating factor secretion signal leader. Extensive site-directed mutagenesis of the prepro-region of the α-mating factor secretion signal sequence was performed in order to determine the effects of various deletions and substitutions on expression.
As an alternative to the common template-based protein structure prediction methods based on main-chain position, a novel side-chain centric approach has been developed. Together with a Bayesian loop modeling procedure and a combination scoring function, the Stone Soup algorithm was applied to the CASP9 set of template-based modeling targets. Although the method did not generate perturbations to the template structures as large as necessary, the analysis of the results gives unique insights into the differences in packing between the target structures and their templates.
Motivation: While protein secondary structure is well understood, representing the repetitive nature of tertiary packing in proteins remains difficult. We have developed a construct called the relative packing group (RPG) that applies the clique concept from graph theory as a natural basis for defining the packing motifs in proteins. An RPG is defined as a clique of residues, where every member contacts all others as determined by the Delaunay tessellation.
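The clique definition above can be illustrated with a minimal sketch. It assumes the residue contact pairs have already been obtained elsewhere (for example from a Delaunay tessellation); the function name `maximal_cliques`, the integer residue labels, and the restriction to maximal cliques are illustrative assumptions, not details taken from the abstract.

```python
def maximal_cliques(contacts):
    """Return maximal cliques of a contact graph (basic Bron-Kerbosch, no pivoting).

    `contacts` is an iterable of (residue_a, residue_b) pairs. How the pairs
    are derived (e.g. Delaunay tessellation) is outside this sketch.
    """
    # Build an undirected adjacency map from the contact pairs.
    neighbors = {}
    for a, b in contacts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    cliques = []

    def expand(current, candidates, excluded):
        # No vertex can extend `current`, so it is a maximal clique.
        if not candidates and not excluded:
            if current:
                cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(neighbors), set())
    return cliques


if __name__ == "__main__":
    # Hypothetical contacts: residues 1, 2, 3 all touch each other; 4 touches only 3.
    example = [(1, 2), (1, 3), (2, 3), (3, 4)]
    for clique in maximal_cliques(example):
        print(sorted(clique))
```

In this toy input, residues 1, 2 and 3 form one group in which every member contacts all others, while residue 4 pairs only with residue 3.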
By providing new insights into the distribution of a protein's torsion angles, recent statistical models for these data have pointed the way to more efficient methods for protein structure prediction. Most current approaches have concentrated on bivariate models at a single sequence position. There is, however, considerable value in simultaneously modeling angle pairs at multiple sequence positions in a protein.
Interest in predicting protein backbone conformational angles has prompted the development of modeling and inference procedures for bivariate angular distributions. We present a Bayesian approach to density estimation for bivariate angular data that uses a Dirichlet process mixture model and a bivariate von Mises distribution. We derive the necessary full conditional distributions to fit the model, as well as the details for sampling from the posterior predictive distribution.
We examine the contribution of residues at the dimer interface of the transcriptional regulator OxyR to oligomerization. Residues in contact across the dimer interface of OxyR were identified using the program Quaternary Contacts (QContacts). Site-directed mutagenesis was performed on the residues other than alanine or glycine identified in the resultant contact profile, and the oligomerization ability of the mutant proteins was tested using the lambda cI repressor system to identify residues that are hot spots in OxyR.
Protein structure prediction has a number of important ad hoc similarity measures for evaluating predictions, but would benefit from a measure that is able to provide a common framework for a broad range of comparisons. Here we show that a mutual information-like measure can provide a comprehensive framework for evaluating protein structure prediction of all types. We discuss the concept of information, its application to secondary structure, and the obstacle to applying it to 3D structure.
We describe an information-theory-based measure of the quality of secondary structure prediction (RELINFO). RELINFO has a simple yet intuitive interpretation: it represents the factor by which secondary structure choice at a residue has been restricted by a prediction scheme. As an alternative interpretation of secondary structure prediction, RELINFO complements currently used methods by providing an information-based view as to why a prediction succeeds or fails.