Motivation: Structure-based prediction of stability changes upon mutation is crucial for protein engineering and design, and for understanding genetic diseases or drug resistance events. For this task, we adopted a simple residue-based orientational potential, previously applied in protein modeling, that considers only three backbone atoms. Applying it to stability prediction only requires parametrizing 12 amino acid-dependent weights using cross-validation strategies on a curated dataset in which we tried to reduce the number of mutations located at protein-protein or protein-ligand interfaces, mutations measured under extreme conditions, and the over-representation of alanine.
We propose and validate a novel method to efficiently explore local protein loop conformations based on a new formalism for constrained normal mode analysis (NMA) in internal coordinates. The manifold of possible loop configurations imposed by the position and orientation of the fixed loop ends is reduced to an orthogonal set of motions (or modes) encoding concerted rotations of all the backbone dihedral angles. We validate the sampling power on a set of protein loops with highly variable experimental structures and demonstrate that our approach can efficiently explore the conformational space of closed loops.
Low-frequency normal modes generated by elastic network models tend to correlate strongly with large conformational changes of proteins, despite their reliance on the harmonic approximation, which is only valid in close proximity of the native structure. We consider 12 variants of the torsional network model (TNM), an elastic network model in torsion angle space, that adopt different sets of torsion angles as degrees of freedom and reproduce with similar quality the thermal fluctuations of proteins but present drastic differences in their agreement with conformational changes. We show that these differences are related to the extent of the deviations from the harmonic approximation, assessed through an anharmonic energy function whose harmonic approximation coincides with the TNM.
Torsion angles are the natural degrees of freedom of protein structures. The ability to determine torsional variations corresponding to observed changes in Cartesian coordinates is highly valuable, notably to investigate the mechanisms of functional conformational changes or to develop computational models of protein dynamics. This issue is far from trivial in practice since the impact of modifying one torsion angle strongly depends on all other angles, and the compounding effects of small variations in bond lengths and valence angles can completely disrupt a protein fold.
Tikhonov regularization, or ridge regression, is a popular technique to deal with collinearity in multivariate regression. We unveil a formal analogy between ridge regression and statistical mechanics, where the objective function is comparable to a free energy, and the ridge parameter plays the role of temperature. This analogy suggests two novel criteria for selecting a suitable ridge parameter: specific-heat (C) and maximum penalty (MP).
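As a point of reference for the ridge regression entry above, the following is a minimal sketch of the standard closed-form ridge estimator and its objective function. It does not implement the specific-heat or maximum-penalty criteria, whose definitions are not given in this summary; the function names `ridge_fit` and `ridge_objective` and the parameter name `lam` are illustrative assumptions.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    # The lam * I term keeps the system well conditioned when columns are collinear.
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


def ridge_objective(X, y, beta, lam):
    """Ridge objective: squared residual norm plus lam times squared coefficient norm."""
    residual = np.asarray(y, dtype=float) - np.asarray(X, dtype=float) @ beta
    return float(residual @ residual + lam * (beta @ beta))
```

Increasing `lam` shrinks the coefficient vector towards zero, which is the behavior any criterion for choosing the ridge parameter has to trade off against the quality of the fit.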
The selective pressures acting on viruses that replicate under enhanced mutation rates are largely unknown. Here, we describe resistance of foot-and-mouth disease virus to the mutagen 5-fluorouracil (FU) through a single polymerase substitution that prevents an excess of A to G and U to C transitions evoked by FU on the wild-type foot-and-mouth disease virus, while maintaining the same level of mutant spectrum complexity. The polymerase substitution inflicts upon the virus a fitness loss during replication in the absence of FU but confers a fitness gain in the presence of FU.
Curr Opin Struct Biol
February 2017
The ability to rationally modify targeted physical and biological features of a protein of interest holds promise in numerous academic and industrial applications and paves the way towards de novo protein design. In particular, bioprocesses that utilize the remarkable properties of enzymes would often benefit from mutants that remain active at temperatures that are either higher or lower than the physiological temperature, while maintaining the biological activity. Many in silico methods have been developed in recent years for predicting the thermodynamic stability of mutant proteins, but very few have focused on thermostability.
The proper biological functioning of proteins often relies on the occurrence of coordinated fluctuations around their native structure, or on their ability to perform wider and sometimes highly elaborate motions. Hence, there is considerable interest in the definition of accurate coarse-grained descriptions of protein dynamics, as an alternative to more computationally expensive approaches. In particular, the elastic network model, in which residue motions are subjected to pairwise harmonic potentials, is known to capture essential aspects of conformational dynamics in proteins, but has so far remained mostly phenomenological, and unable to account for the chemical specificities of amino acids.
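To make the idea of pairwise harmonic potentials concrete, here is a minimal sketch of a generic Gaussian network model, one common elastic network variant. It is an illustration under stated assumptions, not the model developed in the work summarized above: the uniform spring constant, the 7.0 Å cutoff, and the function names `kirchhoff_matrix` and `gnm_fluctuations` are all hypothetical choices.

```python
import numpy as np


def kirchhoff_matrix(coords, cutoff=7.0):
    """Connectivity (Kirchhoff) matrix: -1 for residue pairs within the cutoff,
    and the number of contacts of each residue on the diagonal."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    contact = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma


def gnm_fluctuations(coords, cutoff=7.0):
    """Relative mean-square fluctuation per residue (arbitrary units), taken as
    the diagonal of the pseudo-inverse of the Kirchhoff matrix."""
    gamma = kirchhoff_matrix(coords, cutoff)
    eigvals, eigvecs = np.linalg.eigh(gamma)
    # Discard zero-eigenvalue modes, which correspond to rigid-body motion.
    keep = eigvals > 1e-8
    pinv = (eigvecs[:, keep] / eigvals[keep]) @ eigvecs[:, keep].T
    return np.diag(pinv)
```

In this generic form every contact carries the same spring constant, which is exactly the kind of residue-independent, phenomenological description the abstract refers to.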
Community-wide blind prediction experiments such as CAPRI and CASP provide an objective measure of the current state of predictive methodology. Here we describe a community-wide assessment of methods to predict the effects of mutations on protein-protein interactions. Twenty-two groups predicted the effects of comprehensive saturation mutagenesis for two designed influenza hemagglutinin binders and the results were compared with experimental yeast display enrichment data obtained using deep sequencing.
The ability of proteins to establish highly selective interactions with a variety of (macro)molecular partners is a crucial prerequisite to the realization of their biological functions. The availability of computational tools to evaluate the impact of mutations on protein-protein binding can therefore be valuable in a wide range of industrial and biomedical applications, and help rationalize the consequences of non-synonymous single-nucleotide polymorphisms. BeAtMuSiC (http://babylone.
The ability to rapidly and accurately predict the effects of mutations on the physicochemical properties of proteins holds tremendous importance in the rational design of modified proteins for various types of industrial, environmental or pharmaceutical applications, as well as in elucidating the genetic background of complex diseases. In many cases, the absence of an experimentally resolved structure represents a major obstacle, since most currently available predictive software crucially depends on it. We investigate here the relevance of combining coarse-grained structure-based stability predictions with a simple comparative modeling procedure.
Available DNA microarray time series that record gene expression along the developmental stages of multicellular eukaryotes, or in unicellular organisms subject to external perturbations such as stress and diauxie, are analyzed. By pairwise comparison of the gene expression profiles on the basis of a translation-invariant and scale-invariant distance measure corresponding to least-rectangle regression, it is shown that peaks in the average distance values are noticeable and are localized around specific time points. These points systematically coincide with the transition points between developmental phases or just follow the external perturbations.
Motivation: Accurate prediction of protein stability is important for understanding the molecular underpinnings of diseases and for the design of new proteins. We introduce a novel approach for the prediction of changes in protein stability that arise from a single-site amino acid substitution; the approach uses available data on mutations occurring in the same position and in other positions. Our algorithm, named Pro-Maya (Protein Mutant stAbilitY Analyzer), combines a collaborative filtering baseline model, Random Forests regression and a diverse set of features.
Spinocerebellar Ataxia Type 3 (SCA3) is one of nine polyglutamine (polyQ) diseases that are all characterized by progressive neuronal dysfunction and the presence of neuronal inclusions containing aggregated polyQ protein, suggesting that protein misfolding is a key part of this disease. Ataxin-3, the causative protein of SCA3, contains a globular, structured N-terminal domain (the Josephin domain) and a flexible polyQ-containing C-terminal tail, the repeat-length of which modulates pathogenicity. It has been suggested that the fibrillogenesis pathway of ataxin-3 begins with a non-polyQ-dependent step mediated by Josephin domain interactions, followed by a polyQ-dependent step.
Background: The rational design of modified proteins with controlled stability is of extreme importance in a whole range of applications, notably in the biotechnological and environmental areas, where proteins are used for their catalytic or other functional activities. Future breakthroughs in medical research may also be expected from an improved understanding of the effect of naturally occurring disease-causing mutations on the molecular level.
Results: PoPMuSiC-2.
The VPAC(1) receptor belongs to family B of G protein-coupled receptors (GPCR-B) and is activated upon binding of the vasoactive intestinal peptide (VIP). Despite the recent determination of the structure of the N terminus of several members of this receptor family, little is known about the structure of the transmembrane (TM) region and about the molecular mechanisms leading to activation. In the present study, we designed a new structural model of the TM domain and combined it with mutagenesis experiments to investigate the interaction network that governs ligand binding and receptor activation.
The goal of controlling protein thermostability is tackled here through establishing, by in silico analyses, the relative weight of residue-residue interactions in proteins as a function of temperature. We have designed for that purpose a (melting-) temperature-dependent, statistical distance potential, where the interresidue distances are computed between the side-chain geometric centers or their functional centers. Their separate derivation from proteins of either high or low thermal resistance reveals the interactions that contribute most to stability in different temperature ranges.
Motivation: The rational design of proteins with modified properties, through amino acid substitutions, is of crucial importance in a large variety of applications. Given the huge number of possible substitutions, every protein engineering project would benefit strongly from the guidance of in silico methods able to predict rapidly, and with reasonable accuracy, the stability changes resulting from all possible mutations in a protein.
Results: We exploit newly developed statistical potentials, based on a formalism that highlights the coupling between four protein sequence and structure descriptors, and take into account the amino acid volume variation upon mutation.
The time evolution of gene expression across the developmental stages of the host organism can be inferred from appropriate DNA microarray time series. Modeling this evolution aims eventually at improving the understanding and prediction of the complex phenomena that are the basis of life. We focus on the embryonic-to-adult development phases of Drosophila melanogaster, and chose to model the expression network with the help of a system of differential equations with constant coefficients, which are nonlinear in the transcript concentrations but linear in their logarithms.
The possibility to rationally design protein mutants that remain structured and active at high temperatures strongly depends on a better understanding of the mechanisms of protein thermostability. Studies devoted to this issue often rely on the living temperature (T(env)) of the host organism rather than on the melting temperature (T(m)) of the analyzed protein. To investigate the scale of this approximation, we probed the relationship between T(m) and T(env) on a dataset of 127 proteins, and found a much weaker correlation than previously expected: the correlation coefficient is equal to 0.
The temperature dependence of the interactions that stabilize protein structures is a long-standing issue, the elucidation of which would enable the prediction and the rational modification of the thermostability of a target protein. It is tackled here by deriving distance-dependent amino acid pair potentials from four datasets of proteins with increasing melting temperatures (Tm). The temperature dependence of the interactions is determined from the differences in the shape of the potentials derived from the four datasets.
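Distance-dependent pair potentials of this kind are commonly obtained by an inverse Boltzmann relation between observed and reference distance distributions. The sketch below shows that generic relation only; the exact formalism, distance bins, and reference state used in the work above are not stated in this summary, so the function name `pair_potential`, the `pseudocount` smoothing, and the unit choice `kT=1.0` are assumptions.

```python
import numpy as np


def pair_potential(pair_counts, reference_counts, kT=1.0, pseudocount=1.0):
    """Inverse Boltzmann potential per distance bin:
    W(d) = -kT * ln( P(d | residue pair) / P(d | reference) )."""
    # A pseudocount avoids taking the logarithm of zero in sparsely populated bins.
    pair = np.asarray(pair_counts, dtype=float) + pseudocount
    ref = np.asarray(reference_counts, dtype=float) + pseudocount
    p_pair = pair / pair.sum()
    p_ref = ref / ref.sum()
    return -kT * np.log(p_pair / p_ref)
```

Deriving one such potential per dataset, for example one per melting-temperature range, and comparing the resulting curves bin by bin is one way to look at how the shape of an interaction changes between datasets.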
The ability to rationally increase the stability and solubility of recombinant proteins has long been a goal of biotechnology and has significant implications for biomedical research. Poorly soluble enzymes, for example, result in the need for larger reaction volumes, longer incubation times, and more restricted reaction conditions, all of which increase the cost and have a negative impact on the feasibility of the process. Rational design is achieved here by means of the PoPMuSiC program, which performs in silico predictions of stability changes upon single-site mutations.
Novel statistical potentials derived from known protein structures are presented. They are designed to describe cation-pi and amino-pi interactions between a positively charged amino acid or an amino acid carrying a partially charged amino group and an aromatic moiety. These potentials are based on the propensity of residue types to be separated by a certain spatial distance or to have a given relative orientation.
Knowledge-based potentials are widely used in simulations of protein folding, structure prediction, and protein design. Their advantages include limited computational requirements and the ability to deal with low-resolution protein models compatible with long-scale simulations. Their drawbacks include their dependence on specific features of the dataset from which they are derived, such as the size of the proteins it contains, and the fact that their physical meaning is still a subject of debate.