The synonymous codon composition of a gene is well known to play an important role in its expression level. A number of algorithms optimize the codon usage of recombinant genes to maximize their expression in host cells. Nevertheless, the underlying mechanism remains unresolved and is of significant relevance.
The ability of protein chains to spontaneously form their three-dimensional structures is a long-standing mystery in molecular biology. The most conceptual aspect of this mystery is how the protein chain can find its native, "working" spatial structure (which, for not too big protein chains, corresponds to the global free energy minimum) in a biologically reasonable time, without exhaustive enumeration of all possible conformations, which would take billions of years. This is the so-called "Levinthal's paradox".
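The scale of the exhaustive enumeration mentioned above can be illustrated with a rough back-of-the-envelope sketch. The chain length, the number of conformations per residue, and the sampling rate below are illustrative assumptions, not values taken from the abstract.

```python
# Rough estimate of the time needed to enumerate every chain conformation.
# All default parameter values are assumptions chosen for illustration only.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(n_residues, states_per_residue=3,
                            samples_per_second=1e13):
    """Years needed to visit every conformation once at a fixed sampling rate."""
    # The number of conformations grows exponentially with chain length.
    conformations = states_per_residue ** n_residues
    return conformations / samples_per_second / SECONDS_PER_YEAR
```

Under these assumptions, even a chain of 100 residues gives a search time vastly longer than any biologically reasonable folding time, which is the essence of the paradox.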
Motivation: Epistasis, the context-dependence of the contribution of an amino acid substitution to fitness, is common in evolution. To detect epistasis, fitness must be measured for at least four genotypes: the reference genotype, two different single mutants and a double mutant with both of the single mutations. For higher-order epistasis of order n, fitness has to be measured for all 2^n genotypes of an n-dimensional hypercube in genotype space, forming a "combinatorially complete dataset".
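The four-genotype measurement and the 2^n hypercube described above can be sketched as follows. This is a minimal illustration, not the authors' method: the additive fitness scale and all function names are assumptions (a multiplicative or log scale is equally common in practice).

```python
from itertools import product


def pairwise_epistasis(f_ref, f_a, f_b, f_ab):
    """Deviation of the double mutant from the sum of the single-mutant effects.

    Assumes an additive fitness scale; zero means no pairwise epistasis.
    """
    return (f_ab - f_ref) - ((f_a - f_ref) + (f_b - f_ref))


def hypercube_genotypes(n):
    """All 2^n genotypes over n sites; 0 = reference state, 1 = mutant state."""
    return list(product((0, 1), repeat=n))


def is_combinatorially_complete(measured, n):
    """True if every vertex of the n-dimensional hypercube has been measured."""
    return set(hypercube_genotypes(n)) <= set(measured)
```

For n = 2 the hypercube has exactly the four genotypes listed above: (0, 0), (0, 1), (1, 0) and (1, 1).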
Motivation: Computational prediction of the effect of mutations on protein stability is used by researchers in many fields. The utility of the prediction methods is affected by their accuracy and bias. Bias, a systematic shift of the predicted change of stability, has been noted as an issue for several methods, but has not been investigated systematically.
The ability of protein chains to spontaneously form their spatial structures is a long-standing puzzle in molecular biology. Experimentally measured folding times of single-domain globular proteins range from microseconds to hours: the difference (10-11 orders of magnitude) is the same as that between the life span of a mosquito and the age of the universe. This review describes physical theories of rates of overcoming the free-energy barrier separating the natively folded (N) and unfolded (U) states of protein chains in both directions: "U-to-N" and "N-to-U".
Fitness landscapes depict how genotypes manifest at the phenotypic level and form the basis of our understanding of many areas of biology, yet their properties remain elusive. Previous studies have analysed specific genes, often using their function as a proxy for fitness, experimentally assessing the effect on function of single mutations and their combinations in a specific sequence or in different sequences. However, systematic high-throughput studies of the local fitness landscape of an entire protein have not yet been reported.
The prediction of protein folding rates is a necessary step towards understanding the principles of protein folding. Due to the increasing amount of experimental data, numerous protein folding models and predictors of protein folding rates have been developed in the last decade. The problem has also attracted the attention of scientists from computational fields, which led to the publication of several machine learning-based models to predict the rate of protein folding.
Regulated intramembrane proteolysis (RIP) is a critical mechanism for intercellular communication and regulates the function of membrane proteins through sequential proteolysis. RIP typically starts with ectodomain shedding of membrane proteins by extracellular membrane-bound proteases, followed by intramembrane proteolysis of the resulting membrane-tethered fragment. However, for the majority of RIP proteases the corresponding substrates, and thus their functions, remain unknown.
Experimentally measured rates of spontaneous folding of single-domain globular proteins range from microseconds to hours: the difference (11 orders of magnitude!) is akin to the difference between the life span of a mosquito and the age of the Universe. We show that physical theory with biological constraints outlines the possible range of folding rates for single-domain globular proteins of various size and stability, and that the experimentally measured folding rates fall within this narrow "golden triangle" built without any adjustable parameters, filling it almost completely. This "golden triangle" also successfully predicts the maximal allowed size of the "foldable" protein domains, as well as the maximal size of protein domains that fold under solely thermodynamic (rather than kinetic) control.
The ability of protein chains to spontaneously form their spatial structures is a long-standing puzzle in molecular biology. Experimentally measured rates of spontaneous folding of single-domain globular proteins range from microseconds to hours: the difference (11 orders of magnitude) is akin to the difference between the life span of a mosquito and the age of the universe. Here, we show that physical theory with biological constraints outlines a "golden triangle" limiting the possible range of folding rates for single-domain globular proteins of various size and stability, and that the experimentally measured folding rates fall within this narrow triangle built without any adjustable parameters, filling it almost completely.
Here we present a systematic analysis of accessible surface areas and hydrogen bonds of 2554 globular proteins from four structural classes (all-α, all-β, α/β and α+β proteins). The analysis aims to determine in which structural class the accessible surface area increases most rapidly with increasing protein molecular mass, and which structural peculiarities are responsible for this effect. The beta structural class of proteins was found to be the leader, with two possible explanations. First, in beta structural proteins, the fraction of residues not included in the regular secondary structure is the largest. Second, the accessible surface area of packed elements of the beta-structure increases more rapidly with increasing molecular mass than that of the alpha-structure.
Intrinsically disordered regions serve as molecular recognition elements, which play an important role in the control of many cellular processes and signaling pathways. It is useful to be able to predict positions of disordered regions in protein chains. A statistical analysis of disordered residues was performed on 34,464 unique protein chains taken from the PDB.
There are several important questions on the coupling between properties of the protein shape and the rate of protein folding. We have studied a series of structural descriptors intended for describing protein shapes (the radius of gyration, the radius of cross-section, and the coefficient of compactness) and their possible connection with folding behavior, either rates of folding or the emergence of folding intermediates, and compared them with classical descriptors, protein chain length and contact order. It has been found that when a descriptor is normalized to eliminate the influence of the protein size (the radius of gyration normalized to the radius of gyration of a ball of equal volume, the coefficient of compactness defined as the ratio of the accessible surface area of a protein to that of an ideal ball of equal volume, and relative contact order), it completely loses its ability to predict folding rates.
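The two size-normalized shape descriptors defined above can be sketched as follows. This is a minimal illustration under stated assumptions: the radius of gyration is computed without mass weighting, the protein volume and accessible surface area are supplied by the caller (their calculation is not shown), and all function names are hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of points from their centroid (no mass weighting)."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
                  for x, y, z in coords)
    return math.sqrt(squared / n)


def ball_radius(volume):
    """Radius of a ball with the given volume, from V = (4/3) * pi * R^3."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


def normalized_rg(coords, volume):
    """Radius of gyration divided by that of a uniform ball of equal volume."""
    # A uniform solid ball of radius R has radius of gyration sqrt(3/5) * R.
    return radius_of_gyration(coords) / (math.sqrt(3.0 / 5.0) * ball_radius(volume))


def compactness(asa, volume):
    """Accessible surface area divided by the surface of a ball of equal volume."""
    r = ball_radius(volume)
    return asa / (4.0 * math.pi * r * r)
```

By construction, a perfect ball has a compactness of 1, so larger values indicate a less spherical shape.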
We suggest an algorithm that inputs a protein sequence and outputs a decomposition of the protein chain into a regular part including secondary structures and a nonregular part corresponding to loop regions. We have analyzed loop regions in a protein dataset of 3,769 globular domains and defined the optimal parameters for this prediction: the threshold between regular and nonregular regions and the optimal window size for averaging procedures using the scale of the expected number of contacts in a globular state and entropy scale as the number of degrees of freedom for the angles phi, psi, and chi for each amino acid. Comparison with known methods demonstrates that our method gives the same results as the well-known ALB method based on physical properties of amino acids (the percentage of true predictions is 64% against 66%), and worse prediction for regular and nonregular regions than PSIPRED (Protein Structure Prediction Server) without alignment of homologous proteins (the percentage of true predictions is 73%).
We propose here KineticDB, a systematically compiled database of protein folding kinetics, which contains about 90 unique proteins. The main goal of the KineticDB is to provide users with a diverse set of protein folding rates determined experimentally. The search for determinants of protein folding is still in progress, aimed at obtaining a new understanding of the folding process.
We have demonstrated here that protein compactness, which we define as the ratio of the accessible surface area of a protein to that of the ideal sphere of the same volume, is one of the factors determining the mechanism of protein folding. Proteins with multi-state kinetics, on average, are more compact (compactness is 1.49+/-0.
We have demonstrated that, among proteins of the same size, alpha/beta proteins have on average a greater number of contacts per residue due to their more compact (more "spherical") structure, rather than due to tighter packing. We have examined the relationship between the average number of contacts per residue and folding rates in globular proteins according to general protein structural class (all-alpha, all-beta, alpha/beta, alpha+beta). Our analysis demonstrates that alpha/beta proteins have both the greatest number of contacts and the slowest folding rates in comparison to proteins from the other structural classes.
Archaea, bacteria and eukaryotes represent the main kingdoms of life. Is there any trend for amino acid compositions of proteins found in full genomes of species of different kingdoms? What is the percentage of totally unstructured proteins in various proteomes? We obtained amino acid frequencies for different taxa using 195 known proteomes and all annotated sequences from the Swiss-Prot database. Investigation of the two databases (proteomes and Swiss-Prot) shows that the amino acid compositions of proteins differ substantially for different kingdoms of life, and this difference is larger between different proteomes than between different kingdoms of life.
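The amino acid frequency comparison described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the distance measure (sum of absolute frequency differences) is an assumption chosen for simplicity.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Fraction of each standard amino acid over a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore ambiguous or non-standard symbols such as X, B, or Z.
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def composition_distance(comp_a, comp_b):
    """Sum of absolute frequency differences (an assumed, simple measure)."""
    return sum(abs(comp_a[aa] - comp_b[aa]) for aa in AMINO_ACIDS)
```

Computing `composition` once per proteome and once per kingdom would allow the two kinds of differences mentioned above to be compared on the same scale.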