Fitness landscapes depict how genotypes manifest at the phenotypic level and form the basis of our understanding of many areas of biology, yet their properties remain elusive. Previous studies have analysed specific genes, often using their function as a proxy for fitness, experimentally assessing the effect on function of single mutations and their combinations in a specific sequence or in different sequences. However, systematic high-throughput studies of the local fitness landscape of an entire protein have not yet been reported.
Gammaproteobacteria get energy for their growth from different carbon sources using either glycolysis or alternative metabolic pathways induced under stress conditions. These metabolic switches are coordinated by a complex interplay of regulatory proteins that sense concentrations of available metabolites by mechanisms yet to be understood. Here, we use two transcriptional regulators, ExuR and UxuR, controlling d-galacturonate (d-gal) and d-glucuronate metabolism in Escherichia coli, as targets for a computational search for low-molecular-weight compounds capable of binding their ligand-binding domains.
The nature of factors governing the tempo and mode of protein evolution is a fundamental issue in evolutionary biology. Specifically, whether or not interactions between different sites, or epistasis, are important in directing the course of evolution has become one of the central questions. Several recent reports have scrutinized patterns of long-term protein evolution, claiming them to be compatible only with an epistatic fitness landscape.
The main forces directing long-term molecular evolution remain obscure. A sizable fraction of amino-acid substitutions seem to be fixed by positive selection, but it is unclear to what degree long-term protein evolution is constrained by epistasis, that is, instances when substitutions that are accepted in one genotype are deleterious in another. Here we obtain a quantitative estimate of the prevalence of epistasis in long-term protein evolution by relating data on amino-acid usage in 14 organelle proteins and 2 nuclear-encoded proteins to their rates of short-term evolution.
Background: The evolution and genomic frequencies of stop codons have not been rigorously studied, with the exception of the coding of non-canonical amino acids. Here we study the rate of evolution and frequency distribution of stop codons in bacterial genomes.
Results: We show that in bacteria stop codons evolve more slowly than synonymous sites, suggesting the action of weak negative selection.
Although some data link archaeal and eukaryotic translation, the overall mechanism of protein synthesis in archaea remains largely obscure. Both archaeal (aRF1) and eukaryotic (eRF1) single release factors recognize all three stop codons. The archaeal family Methanosarcinaceae contains two aRF1 homologs, and also uses the UAG stop codon to encode the 22nd amino acid, pyrrolysine.
We identified a mutation in the CRYGD gene (P23S) of the gamma-crystallin gene cluster that is associated with a polymorphic congenital cataract that occurs with a frequency of approximately 0.3% in a human population. To gain insight into the molecular mechanism of the pathogenesis of gamma-crystallin isoforms, we undertook an evolutionary analysis of the available mammalian and newly obtained primate sequences of the gamma-crystallin genes.
Having an extra copy of a gene is thought to provide some functional redundancy, which results in a higher rate of evolution in duplicated genes. In this article, we estimate the impact of gene duplication on the selection of tuf paralogs, and we find that in the absence of gene conversion, tuf paralogs have evolved significantly more slowly than when gene conversion has been a factor in their evolution. Thus, tuf gene copies evolve under a selective pressure that ensures their functional uniformity, and gene conversion reduces selection against amino acid substitutions that affect the function of the encoded protein, EF-Tu.
We describe a new method for polyproline II-type (PPII) secondary structure prediction based on tetrapeptide conformation properties using data obtained from all globular proteins in the Protein Data Bank (PDB). This is the first method for PPII prediction with a relatively high level of accuracy (approximately 60%). Our method uses only frequencies of different conformations among oligopeptides without any additional parameters.
We have constructed a very large virtual diversity space containing more than 10^13 chemical compounds. The diversity space is built from about 400 combinatorial libraries, which have been expanded by choosing sizeable collections of suitable R-groups that can be attached to each link point of their scaffolds. These R-group collections have been created by selecting reagents that have drug-like properties from catalogs of available chemicals.
Alignment of protein sequences is a key step in most computational methods for prediction of protein function and homology-based modeling of three-dimensional (3D) structure. We investigated the correspondence between "gold standard" alignments of 3D protein structures and the sequence alignments produced by the Smith-Waterman algorithm, currently the most sensitive method for pair-wise alignment of sequences. The results of this analysis enabled the development of a novel method to align a pair of protein sequences.