Ribonucleic acids (RNAs) are one of the most important types of biopolymers. RNAs play key roles in the storage and multiplication of genetic information. They are important in catalysis and RNA splicing and take part in the most important steps of translation.
Over the past years, evidence has been accumulating for a fundamental role of protein-protein interactions between transcription factors in gene-specific transcription regulation. Many of these interactions occur within composite elements containing binding sites for several factors. We have selected 101 composite regulatory elements identified experimentally in the regulatory regions of 64 genes of vertebrates and of their viruses and briefly described them in a compilation.
Comput Appl Biosci
October 1995
A method for identification of eukaryotic promoters by localization of binding sites for transcription factors has been suggested. The binding sites for a range of transcription factors have been found to be distributed unevenly. Based on these distributions, we have constructed a weight matrix of binding site localization.
Proc Int Conf Intell Syst Mol Biol
December 1995
A computer tool has been developed for revealing sets of oligonucleotides invariant for isofunctional families of DNA (RNA) and for using these in functional identification of nucleotide sequences. The tool allows one to: build up vocabularies of invariant oligonucleotides for the families of isofunctional nucleotide sequences; assess the significance of the vocabularies; identify nucleotide sequences with the vocabularies of invariant oligonucleotides; determine the most effective identification parameters to minimize type I and type II errors; assess the efficiency of identification of individual isofunctional families with the oligonucleotide vocabularies; determine the evolutionary characteristics of the families of isofunctional sequences on which vocabulary volume depends. Using this system, we have analyzed a total of 322 protein-encoding gene families and have built up sets of invariant oligonucleotides, that is, oligonucleotide vocabularies that are characteristic of gene families and subfamilies.
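The idea of an oligonucleotide vocabulary can be illustrated with a minimal Python sketch. It assumes the simplest possible reading of "invariant": an oligonucleotide of fixed length k that occurs in every sequence of the family. The function names, the toy sequences and the `min_fraction` decision rule are hypothetical and are not taken from the tool described above.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (oligonucleotides) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def invariant_vocabulary(family, k):
    """Oligonucleotides of length k that occur in every sequence of the family."""
    if not family:
        return set()
    vocab = kmers(family[0], k)
    for seq in family[1:]:
        vocab &= kmers(seq, k)
    return vocab


def matches_family(seq, vocab, k, min_fraction=1.0):
    """Hypothetical decision rule: assign seq to the family when at least
    min_fraction of the vocabulary occurs in it."""
    if not vocab:
        return False
    found = len(vocab & kmers(seq, k))
    return found / len(vocab) >= min_fraction


# Toy family, invented for illustration only.
family = ["ACGTGA", "TACGTG", "ACGTGC"]
vocab = invariant_vocabulary(family, 4)
print(sorted(vocab))                           # ['ACGT', 'CGTG']
print(matches_family("GGACGTGG", vocab, 4))    # True
```

Lowering `min_fraction` trades missed family members against false assignments, which is the kind of parameter the abstract says is tuned to minimize both error types.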
Proc Int Conf Intell Syst Mol Biol
December 1995
We present the computer tool FUNSITE for description and analysis of regulatory sequences of eukaryotic genomes. The tool consists of the following main parts: 1) An integrated database for genomic regulatory sequences. The integrated database was designed on the basis of the databases TRANSFAC (Wingender 1994) and TRRD (Kel et al.
Using a set of sequences of 63 cleavage/polyadenylation sites of vertebrate pre-mRNA, a generalized consensus matrix was constructed. The elements of the matrix were the absolute frequencies of oligonucleotides of length l at the ith position of sites. The cleavage point of each site was assigned the same position number.
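A matrix of this kind can be sketched in a few lines of Python. The sketch assumes each site is given as a sequence plus the 0-based index of its cleavage point, and numbers positions relative to that point so that all cleavage points share position 0. The function name, the data layout and the two toy sites are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_matrix(sites, l):
    """Count oligonucleotides of length l at each position relative to cleavage.

    sites: list of (sequence, cleavage_index) pairs.
    Returns {relative_position: Counter(oligonucleotide -> absolute frequency)},
    where relative position 0 is the cleavage point of every site.
    """
    matrix = defaultdict(Counter)
    for seq, cleavage in sites:
        for start in range(len(seq) - l + 1):
            matrix[start - cleavage][seq[start:start + l]] += 1
    return matrix


# Two toy sites, invented for illustration only.
sites = [("AATAAAGCA", 6), ("CAATAAACA", 7)]
m = consensus_matrix(sites, 6)
print(m[-6]["AATAAA"])   # 2: the hexamer starts 6 nt upstream of cleavage in both sites
```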
A method has been developed to detect pairs of positions with correlated mutations in protein multiple sequence alignments. The method is based on reconstruction of the phylogenetic tree for a set of sequences and statistical analysis of the distribution of mutations in the branches of the tree. The database of homology-derived protein structures (HSSP) is used as the source of multiple sequence alignments for proteins of known three-dimensional structure.
We developed the computer system SITEVIDEO for analysis and recognition of functional sites in DNA and RNA molecules. It reveals contextual features essential for site function and thus enables the user to design efficient methods for recognition of the functional sites. We considered mainly quantitative characteristics reflecting the uneven distribution of oligonucleotides in the sequences of the functional sites of interest.
We present a new pairwise alignment algorithm that uses iterative statistical analysis of homologous subsequences. In contrast to the classical conversion of the DOT-matrix characteristic of the Needleman-Wunsch algorithm (NW), we used only those matrix elements that corresponded to the most non-random subsequence homologies. The most reliable elements of the DOT-matrix are written to the compact competition matrices.
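For orientation, the classical Needleman-Wunsch baseline that the abstract contrasts itself with can be sketched as follows. This is the standard global-alignment recurrence, not the new algorithm, and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of a and b (classical NW, linear gaps)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill every cell of the matrix from its three neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "ACT"))   # 2: three matches and one gap
```

The baseline fills every cell of the matrix, whereas the method described above keeps only the elements corresponding to the most non-random subsequence homologies.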
Biochim Biophys Acta
November 1992
A new approach for the analysis of mutation hotspots is described. It is based on the classification of hotspot site sequences. Using this approach, the consensus sequences RGYW and TAA of hotspot sites were revealed in the V gene.
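Consensus sequences such as RGYW use the standard IUPAC ambiguity codes (R = A or G, Y = C or T, W = A or T). A minimal Python sketch for locating such sites in a sequence is given below; the helper names and the toy sequence are hypothetical, and the sketch only scans for matches, it does not reproduce the classification method that derived the consensus.

```python
import re

# Subset of the IUPAC nucleotide codes needed for RGYW and TAA.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[c] for c in consensus)


def find_sites(seq, consensus):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead matches without consuming characters, so overlaps are reported.
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in pattern.finditer(seq)]


seq = "CCAGCTTAAGGTA"   # toy sequence, invented for illustration only
print(find_sites(seq, "RGYW"))   # [2, 9]
print(find_sites(seq, "TAA"))    # [6]
```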
This paper is devoted to a computer system designed for the formal description of knowledge about protein structural-functional determinants. A structural-functional determinant is a region of an amino acid sequence with certain structural, functional or evolutionary properties specific for each class of macromolecules. Here the problem is solved by a special method that allows one to design programs able to distinguish these structural-functional determinants.
Based on the analysis of some immunoglobulin V-gene sequences, somatic mutations are assumed to occur by correction of complementary violations in heteroduplexes formed by direct or inverted repeats. The correlation between somatic mutations and repeats is investigated by the statistical weight method for 323 somatic substitutions in 14 V-genes. Assuming absence of correlation, the probability of observing the data in the sample would be very low (0.
The mechanisms by which spontaneous and chemically induced point mutations emerge in DNA have been investigated theoretically using the statistical weight method. We have analysed 12 nucleotide sequences containing 95 spontaneous point mutations and 3 sequences comprising 30 mutations induced by such mutagens as 4'-hydroxymethyl-4,5',8-trimethylpsoralen, sodium bisulfite and hydroxylamine. The possibility that point mutations occur by repair correction of heteroduplexes formed via mispairing of imperfect direct and inverted repeats in DNA has been studied.
A mathematical model of the evolutionary dynamics of the number of Alu repeats in the human genome has been worked out. The model permitted us to observe the dynamics of propagation of Alu repeats within the genome and to evaluate such important parameters of this process as the rates of transposition (insertion of new copies into the genome) and excision of repeats. The peculiarities of the control of the number of Alu repeats in the genome are discussed on the basis of the data obtained.
A detailed phylogenetic analysis of the hemagglutinin H3 gene family of influenza A virus was carried out, taking into account the domain structure of the protein and the positions of antigenic determinants. The densities of fixed synonymous replacements in domains HA1 and HA2 were shown to be practically equal (rho(HA1) = rho(HA2)), whereas those of nonsynonymous replacements were unequal: their ratios were rho(HA1):rho(HA2) = 2.8 for nonepidemic branches, and rho(HA1):rho(HA2) = 7.
Phylogenetic trees of the influenza virus hemagglutinin, neuraminidase and NS genes were constructed. Assuming that synonymous replacements are neutral and that their rates are constant within each tree, the dates of ancestral branch points were calculated, and the rates of fixation of synonymous (Ks) and non-synonymous (Kns) replacements were estimated. The epidemic branches were mostly shown to be "dead ends", non-epidemic ones being internal or "roots.
A contextual analysis of the primary structures of 28 different dispersed repeats Alu, B1, B2 and L1 (Kpn1) of primate and rodent genomes has been carried out to search for regions complementary to the 3'-ends of the currently known tRNAs of these organisms. It has been shown that many of the investigated repeats contain regions revealing marked and statistically non-random complementarity to the 3'-end of at least one of the tRNAs considered. Taking into account the retrovirus replication model known from the literature, these regions are considered as potential primer-binding sites for the initiation of reverse transcription of the repeats or of longer DNA fragments containing them.
Based on the analysis of the nucleotide sequences of a number of immunoglobulin genes, it has been suggested that somatic mutations emerge through correction of imperfect duplexes formed by mispairing of complementary regions of direct and inverted repeats. The present work provides new data confirming this mechanism of somatic hypermutagenesis. It has been shown that the presented sample of V- and J-segments of immunoglobulin genes is abundant in nonrandom imperfect direct repeats and complementary palindromes.
In the present work a theoretical analysis has been carried out of the molecular mechanisms by which duplications emerge in prokaryotic genomes on the basis of direct repeats. The correlations obtained show that the duplication rate depends on such parameters as the distance between the repeated regions, the nucleotide composition of the repeats and the number of homology disruptions in them. It has been revealed that the duplication rate decreases faster than the deletion rate as the distance between the repeats grows.
A contextual analysis of the nucleotide sequences of the regions in which 22 Alu repeats are located in the human genome has been carried out, and some of their peculiarities have been revealed. In particular, marked and statistically non-random homology between the repeats and the regions of their integration has been shown. Taking these peculiarities into account, a mechanism by which the insertion regions of Alu repeats are chosen in the genome has been suggested.
In the present work a mechanism by which deletions emerge through mispairing of the complementary DNA regions of direct repeats has been investigated theoretically. A quantitative dependence of the deletion rate on such parameters of the flanking repeats as their nucleotide composition, the number of homology disruptions and the distance between the repeated regions has been constructed. It has been shown that, using this relationship, one can reliably evaluate the total rates of deletions emergence in the lacI gene sequence of E.
In the present work a computer analysis of the localization of deletions in the sequence of the E. coli lacI gene has been carried out by the statistical weight method. A reliable statistical correlation has been shown between the sites of deletions and the arrangement of the most perfect direct repeats with the shortest distance between the repeated fragments.
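The first step of such an analysis, locating direct repeats and the distance between their copies, can be sketched in Python. The sketch is deliberately simplified: it finds only exact repeats of a fixed length k, whereas the work above also considers imperfect repeats, and it does not implement the statistical weight method. The function name, the `max_distance` parameter and the toy sequence are hypothetical.

```python
from collections import defaultdict


def direct_repeats(seq, k, max_distance=None):
    """Find pairs of exact direct repeats of length k.

    Returns a sorted list of (first_start, second_start, oligonucleotide).
    Distance is measured between the start positions of the two copies.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    pairs = []
    for oligo, starts in positions.items():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                distance = starts[b] - starts[a]
                if max_distance is None or distance <= max_distance:
                    pairs.append((starts[a], starts[b], oligo))
    return sorted(pairs)


seq = "GCTAGGATCGCTAGC"   # toy sequence, invented for illustration only
print(direct_repeats(seq, 4))   # [(0, 9, 'GCTA'), (1, 10, 'CTAG')]
```

Passing a small `max_distance` restricts the output to closely spaced repeats, the class that the abstract associates with deletion sites.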