Phylogenetic inference based on protein sequence alignment is a widely used procedure. Numerous phylogenetic algorithms have been developed, most of which have many parameters and options. Choosing a program, options, and parameters can be a nontrivial task.
Some restriction-modification systems contain two DNA methyltransferases. In the present work, we have classified such systems according to the families of catalytic domains present in the restriction endonucleases and both DNA methyltransferases. Evolution of the restriction-modification systems containing an endonuclease with a NOV_C family domain and two DNA methyltransferases, both with DNA_methylase family domains, was investigated in detail.
e-mail: sas@belozersky.msu.ru
Protein phylogeny is usually reconstructed based on a multiple alignment of amino acid sequences.
Background: Eukaryotic protein-coding genes consist of exons and introns. Exon-intron borders are conserved between species, and thus their changes might be observed only over quite long evolutionary distances. One of the rarest types of change, in which an intron relocates over a short distance, is called "intron sliding", but the reality of this event has been debated for a long time.
Background: Many algorithms and programs are available for phylogenetic reconstruction of families of proteins. Widely used methods rely either on a number of distance-based principles or on the character-based principles of maximum parsimony or maximum likelihood.
Results: We developed a novel program, named PQ, for reconstructing protein and nucleic acid phylogenies following a new character-based principle.
Palindromes are frequently underrepresented in prokaryotic genomes. The palindromic 5′-GATC-3′ site is a recognition site of different restriction-modification (R-M) systems, as well as of the solitary methyltransferase Dam. Classical GATC-specific R-M systems methylate GATC and cleave unmethylated GATC.
Background: Avoidance of palindromic recognition sites of Type II restriction-modification (R-M) systems was shown for many R-M systems in dozens of prokaryotic genomes. However, the phenomenon has not been investigated systematically for all presently available genomes and annotated R-M systems. We have studied all known recognition sites in thousands of prokaryotic genomes and found factors that influence their avoidance.
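To make the notion of site avoidance concrete, here is a minimal Python sketch. It checks whether a site is palindromic (equal to its own reverse complement) and computes an observed-to-expected ratio for the site in a genome. The expectation used here, a product of single-nucleotide frequencies, is an assumption chosen for simplicity; the abstracts above do not state which statistic the studies used, and all function names are hypothetical.

```python
from collections import Counter

# Translation table for complementing a DNA string (upper-case A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic if it equals its own reverse complement."""
    return site == reverse_complement(site)


def count_overlapping(genome, site):
    """Count occurrences of site in genome, allowing overlaps."""
    last_start = len(genome) - len(site)
    return sum(1 for i in range(last_start + 1) if genome.startswith(site, i))


def observed_to_expected(genome, site):
    """Ratio of the observed site count to a naive expectation.

    Assumption: the expectation is the product of single-nucleotide
    frequencies times the number of possible start positions. This is an
    illustrative baseline, not the method of the cited studies.
    """
    n = len(genome)
    freqs = Counter(genome)
    probability = 1.0
    for base in site:
        probability *= freqs[base] / n
    expected = probability * (n - len(site) + 1)
    if expected == 0:
        return float("nan")
    return count_overlapping(genome, site) / expected
```

A ratio well below 1 would suggest that the site is underrepresented under this simple baseline. More careful analyses usually condition the expectation on the frequencies of shorter subwords.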
The recent upgrade of the nucleic acid-protein interaction database (NPIDB, http://npidb.belozersky.msu.
Substitution rates strongly depend on their nucleotide context. One of the most studied examples is the excess of C > T mutations in the CG context in various groups of organisms, including vertebrates. Studies on the molecular mechanisms underlying this mutation regularity have provided insights into evolution, mutagenesis, and cancer development.
In general, mutation frequencies are context-dependent: specific adjacent nucleotides may influence the probability of observing a specific type of mutation in a genome. Recently, several hypermutable motifs were identified in the human genome. Namely, there is an increased frequency of T>C mutations in the second position of the words ATTG and ATAG and an increased frequency of A>C mutations in the first position of the word ACAA.
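As an illustration of what "a T>C mutation in the second position of ATTG" means operationally, the following Python sketch counts substitutions that fall at a given position of a given word in a reference sequence. The input format (0-based positions paired with the substituted base) and the function names are assumptions made for this example, not part of the study described above.

```python
def motif_covers(ref, pos, motif, offset):
    """True if motif occurs in ref with its position `offset` aligned to `pos`."""
    start = pos - offset
    return start >= 0 and ref[start:start + len(motif)] == motif


def count_motif_mutations(ref, mutations, motif, offset, alt):
    """Count substitutions to `alt` at position `offset` of `motif`.

    ref       -- reference sequence (string)
    mutations -- iterable of (pos, alt_base) pairs, pos is 0-based (assumed format)
    motif     -- context word, e.g. "ATTG"
    offset    -- 0-based index of the mutated base within the motif
    alt       -- the substituted base, e.g. "C"
    """
    return sum(
        1
        for pos, alt_base in mutations
        if alt_base == alt and motif_covers(ref, pos, motif, offset)
    )


# Usage: T>C at the second position of ATTG, A>C at the first position of ACAA.
reference = "CCATTGCCACAAGG"
observed = [(3, "C"), (4, "C"), (8, "C")]
attg_hits = count_motif_mutations(reference, observed, "ATTG", 1, "C")
acaa_hits = count_motif_mutations(reference, observed, "ACAA", 0, "C")
```

To compare contexts, such counts would normally be divided by the number of occurrences of each word in the reference, so that frequent words do not appear hypermutable merely because they are common.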
Prokaryotic restriction-modification (R-M) systems defend the host cell from invasion by foreign DNA. They comprise two enzymatic activities: a specific DNA cleavage activity and a DNA methylation activity that prevents cleavage. Typically, these activities are provided by two separate enzymes: a DNA methyltransferase (MTase) and a restriction endonuclease (RE).
Background: The substitution rates within different nucleotide contexts are subject to varying levels of bias. The best-known example of such bias is the excess of C to T (C > T) mutations in CpG (CG) dinucleotides. The molecular mechanisms underlying this bias are important factors in human genome evolution and cancer development.
Water molecules immobilized on a protein or DNA surface are known to play an important role in intramolecular and intermolecular interactions. Comparative analysis of related three-dimensional (3D) structures makes it possible to predict the locations of such water molecules on the protein surface. We have developed and implemented the algorithm WLAKE, which detects "conserved" water molecules, i.
Expressed sequence tags (ESTs) represent 500-1000-bp-long sequences corresponding to mRNAs derived from different sources (cell lines, tissues, etc.). The human EST database contains over 8,000,000 sequences, with over 4,000,000,000 total nucleotides.