Degradation of nuclear proteins by the 26S proteasome is essential for cell viability. In yeast, the nuclear envelope protein Cut8 mediates nuclear proteasomal sequestration by an uncharacterized mechanism. Here we describe structures of Schizosaccharomyces pombe Cut8, which show that it contains a unique, modular fold composed of an extended N-terminal, lysine-rich segment that binds the proteasome when ubiquitinated, and a dimer domain followed by a six-helix bundle connected to a flexible C-terminal tail.
Protein-coding genes in eukaryotes are interrupted by introns, but intron densities differ widely between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6-7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing three of the five supergroups of eukaryotes, those for which multiple genome sequences are available.
The arms race between cellular life forms and viruses is a major driving force of evolution. A substantial fraction of bacterial and archaeal genomes is dedicated to antivirus defense. We analyzed the distribution of defense genes and typical mobilome components (such as viral and transposon genes) in bacterial and archaeal genomes and demonstrated statistically significant clustering of antivirus defense systems and mobile genes and elements in genomic islands.
Research in quantitative evolutionary genomics and systems biology led to the discovery of several universal regularities connecting genomic and molecular phenomic variables. These universals include the log-normal distribution of the evolutionary rates of orthologous genes; the power law-like distributions of paralogous family size and node degree in various biological networks; the negative correlation between a gene's sequence evolution rate and expression level; and differential scaling of functional classes of genes with genome size. The universals of genome evolution can be accounted for by simple mathematical models similar to those used in statistical physics, such as the birth-death-innovation model.
Planctomycetes, Verrucomicrobia and Chlamydia are prokaryotic phyla, sometimes grouped together as the PVC superphylum of eubacteria. Some PVC species possess interesting attributes, in particular, internal membranes that superficially resemble eukaryotic endomembranes. Some biologists now claim that PVC bacteria are nucleus-bearing prokaryotes and consider them evolutionary intermediates in the transition from prokaryote to eukaryote.
Background: The CRISPR-Cas adaptive immunity systems that are present in most Archaea and many Bacteria function by incorporating fragments of alien genomes into specific genomic loci, transcribing the inserts and using the transcripts as guide RNAs to destroy the genome of the cognate virus or plasmid. This RNA interference-like immune response is mediated by numerous, diverse and rapidly evolving Cas (CRISPR-associated) proteins, several of which form the Cascade complex involved in the processing of CRISPR transcripts and cleavage of the target DNA. Comparative analysis of the Cas protein sequences and structures led to the classification of the CRISPR-Cas systems into three Types (I, II and III).
The F16L gene of vaccinia virus (VACV) is conserved in all chordopoxviruses except avipoxviruses. The crocodile poxvirus F16 protein ortholog has highly significant similarity to prokaryotic serine recombinases and contains all amino acids that comprise the catalytic site. In contrast, F16 orthologs encoded by other poxviruses show only marginally significant similarity to serine recombinases, lack essential amino acids of the active site and are most likely inactive derivatives of serine recombinases.
We examine the Tree of Life (TOL) as an evolutionary hypothesis and a heuristic. The original TOL hypothesis has failed, but a new "statistical TOL hypothesis" is promising. The TOL heuristic usefully organizes data without positing fundamental evolutionary truth.
The genome sequence of the Mamavirus, a new Acanthamoeba polyphaga mimivirus strain, is reported. At 1,191,693 nt in length and with 1,023 predicted protein-coding genes, the Mamavirus has the largest genome among the known viruses. The genomes of the Mamavirus and the previously described Mimivirus are highly similar in both the protein-coding genes and the intergenic regions.
Accurate inference of orthologous genes is a prerequisite for most comparative genomics studies and is also important for functional annotation of new genomes. Identification of orthologous gene sets typically involves phylogenetic tree analysis, heuristic algorithms based on sequence conservation, synteny analysis, or some combination of these approaches. The most direct tree-based methods typically rely on the comparison of an individual gene tree with a species tree.
Background: Accurate estimation of the divergence time of the extant eukaryotes is a fundamentally important but extremely difficult problem, owing primarily to gross violations of the molecular clock at long evolutionary distances and the lack of appropriate calibration points close to the date of interest. These difficulties are intrinsic to the dating of ancient divergence events and are reflected in the large discrepancies between estimates obtained with different approaches. Estimates of the age of the Last Eukaryotic Common Ancestor (LECA) vary approximately twofold, from ~1,100 million years ago (Mya) to ~2,300 Mya.
MTH1203, a β-CASP metallo-β-lactamase family nuclease from the archaeon Methanothermobacter thermautotrophicus, was identified as a putative nuclease that might contribute to RNA processing. The crystal structure of MTH1203 reveals that, in addition to the metallo-β-lactamase nuclease and the β-CASP domains, it contains two contiguous KH domains that are unique to MTH1203 and its orthologs. RNA-binding experiments indicate that MTH1203 preferentially binds U-rich sequences with a dissociation constant in the micromolar range.
The CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR-associated proteins) modules are adaptive immunity systems that are present in many archaea and bacteria. These defence systems are encoded by operons that have an extraordinarily diverse architecture and a high rate of evolution for both the cas genes and the unique spacer content. Here, we provide an updated analysis of the evolutionary relationships between CRISPR-Cas systems and Cas proteins.
The widespread exchange of genes among prokaryotes, known as horizontal gene transfer (HGT), is often considered to "uproot" the Tree of Life (TOL). Indeed, it is by now clear that individual genes generally have different evolutionary histories. However, the possibility remains that the TOL concept can be reformulated and remain valid as a statistical central trend in the phylogenetic "Forest of Life" (FOL).
The division of labor between template and catalyst is a fundamental property of all living systems: DNA stores genetic information, whereas proteins function as catalysts. The RNA world hypothesis, however, posits that, at the earlier stages of evolution, RNA acted as both template and catalyst. Why would such division of labor evolve in the RNA world? We investigated the evolution of DNA-like molecules, i.
We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia's genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage.
Plants possess two myosin classes, VIII and XI. The myosins XI are implicated in organelle transport, filamentous actin organization, and cell and plant growth. Due to the large size of myosin gene families, knowledge of these molecular motors remains patchy.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) and the associated proteins (Cas) comprise a system of adaptive immunity against viruses and plasmids in prokaryotes. Cas1 is a CRISPR-associated protein that is common to all CRISPR-containing prokaryotes, but its function remains obscure. Here we show that the purified Cas1 protein of Escherichia coli (YgbT) exhibits nuclease activity against single-stranded and branched DNAs, including Holliday junctions, replication forks and 5'-flaps.
The highly conserved Kinase, Endopeptidase and Other Proteins of small Size (KEOPS)/Endopeptidase-like and Kinase associated to transcribed Chromatin (EKC) protein complex has been implicated in transcription, telomere maintenance and chromosome segregation, but its exact function remains unknown. The complex consists of five proteins: Kinase-Associated Endopeptidase (Kae1), a highly conserved protein present in bacteria, archaea and eukaryotes; a kinase (Bud32); and three additional small polypeptides. We showed that the complex is required for a universal tRNA modification, threonyl carbamoyl adenosine (t6A), found in all tRNAs that pair with ANN codons in mRNA.
The severity of seasonal influenza A epidemics is related to the antigenic novelty of the predominant viral strains circulating each year. Support for a strong correlation between epidemic severity and antigenic drift comes from infectious challenge experiments on vaccinated animals and human volunteers, field studies of vaccine efficacy, prospective studies of subjects with laboratory-confirmed prior infections, and analysis of the connection between drift and severity from surveillance data. We show that, given data on the antigenic and sequence novelty of the hemagglutinin protein of clinical isolates of H3N2 virus from a season along with the corresponding data from prior seasons, we can accurately predict the influenza severity for that season.
Background: It is a common belief that all cellular life forms on earth have a common origin. This view is supported by the universality of the genetic code and the universal conservation of multiple genes, particularly those that encode key components of the translation system. A remarkable recent study claims to provide a formal, homology-independent test of the Universal Common Ancestry hypothesis by comparing the ability of a common-ancestry model and a multiple-ancestry model to predict sequences of universally conserved proteins.
Regulation of gene expression during infection of the thermophilic bacterium Thermus thermophilus HB8 with the bacteriophage P23-45 was investigated. Macroarray analysis revealed host transcription shut-off and identified three temporal classes of phage genes: early, middle and late. Primer extension experiments revealed that the 5' ends of P23-45 early transcripts are preceded by a common sequence motif that likely defines early viral promoters.
The first congress on Viruses of Microbes took place at the Institut Pasteur in Paris, France, on 21-25 June 2010. The advances in genomics and metagenomics reported at this meeting reveal the striking and unexpected complexity of the virus world. Viruses, in particular viruses that infect prokaryotes and unicellular eukaryotes, are emerging as the most abundant class of biological entities on earth and a major evolutionary and geochemical force.
Acta Crystallogr Sect F Struct Biol Cryst Commun
October 2010
New distinct versions of known protein folds provide a powerful means of protein-function prediction that complements sequence and genomic context analysis. These structures do not supplant direct biochemical experiments, but are indispensable for the complete characterization of proteins.