Publications by authors named "Yulia Suvorova"

We have developed a de novo method for the identification of dispersed repeats based on the use of random position-weight matrices (PWMs) and an iterative procedure (IP). The resulting algorithm (the IP method) detects dispersed repeats for which the average number of substitutions per nucleotide between any two repeats is less than or equal to 1.5.
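
To make the role of a position-weight matrix concrete, here is a minimal sketch of building a log-odds PWM from a few aligned copies and scanning a sequence with it. It is not the IP method itself: the uniform background, the pseudocount, and the function names `build_pwm`, `score_window`, and `best_hit` are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"

def build_pwm(aligned, pseudocount=1.0):
    """Log-odds PWM (vs. an assumed uniform background) from equal-length sequences."""
    length = len(aligned[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases keep a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for seq in aligned:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in ALPHABET})
    return pwm

def score_window(pwm, window):
    """Sum of per-position log-odds scores for one window of PWM width."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max((score_window(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))
```

A call such as `best_hit(build_pwm(["ACGTAC", "ACGAAC"]), "GGGGACGTACGGGG")` returns the score and start offset of the best-matching window.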

Article Synopsis
  • Transposable elements (TEs), specifically Short Interspersed Nuclear Elements (SINEs), play a major role in eukaryotic genomes and are challenging to identify due to rapid mutations after insertion.
  • The Highly Divergent Repeat Search Method (HDRSM) outperformed the RepeatMasker program at identifying highly divergent SINE copies in the rice genome and at determining their boundaries, yielding 14,030 hits, 5,704 of which RepeatMasker missed.
  • To achieve a complete understanding of SINE distribution, using both HDRSM and RepeatMasker is advised, as HDRSM excels in detecting divergent copies while RepeatMasker is more effective for shorter, more similar copies.

In this study, we developed a new mathematical method for performing multiple alignment of highly divergent sequences (MAHDS), i.e., sequences that have on average more than 2.


It is known that nucleotide sequences are not totally homogeneous and that this heterogeneity cannot be explained by random fluctuations alone. Such heterogeneity poses the problem of segmenting a sequence into a set of homogeneous parts separated by points called "change points". In this work we investigated a special case of change points: paired change points (PCPs).
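
As a rough illustration of what a change point is, the sketch below finds the single split of a sequence that most reduces base-composition entropy (equivalently, that maximizes the Jensen-Shannon divergence between the two parts). This is a generic single-change-point criterion swapped in for illustration, not the paired-change-point method of the study; the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy (bits) of a base-count table."""
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n)

def best_change_point(seq):
    """Return (gain, index) of the split seq[:index] | seq[index:] with the
    largest entropy reduction; index is None if no split gives a positive gain."""
    n = len(seq)
    whole = entropy(Counter(seq), n)
    left, right = Counter(), Counter(seq)
    best = (0.0, None)
    for i in range(1, n):
        base = seq[i - 1]
        left[base] += 1    # move one base from the right part to the left part
        right[base] -= 1
        gain = (whole
                - (i / n) * entropy(left, i)
                - ((n - i) / n) * entropy(right, n - i))
        if gain > best[0]:
            best = (gain, i)
    return best
```

In practice a significance test would be needed to decide whether the best split reflects real heterogeneity rather than random fluctuation; that step is omitted here.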


Triplet periodicity (TP) is a distinctive feature of the protein-coding sequences of both prokaryotic and eukaryotic genomes. In this work, we explored TP differences within and between 45 prokaryotic genomes. We formulated two hypotheses about the TP distribution over a set of coding sequences and generated artificial datasets that correspond to these hypotheses.
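
One simple way to quantify triplet periodicity, assumed here for illustration rather than taken from the study, is a 3 x 4 matrix of base counts per codon position together with the mutual information between position and base. The names `tp_matrix` and `mutual_information` are hypothetical.

```python
import math

BASES = "ACGT"

def tp_matrix(seq):
    """3 x 4 count matrix: rows are positions 0-2 modulo 3, columns are A, C, G, T."""
    matrix = [[0] * 4 for _ in range(3)]
    for i, base in enumerate(seq):
        if base in BASES:  # skip ambiguous symbols such as N
            matrix[i % 3][BASES.index(base)] += 1
    return matrix

def mutual_information(matrix):
    """Mutual information (bits) between position modulo 3 and base."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[r][c] for r in range(3)) for c in range(4)]
    mi = 0.0
    for r in range(3):
        for c in range(4):
            n = matrix[r][c]
            if n:
                mi += (n / total) * math.log2(n * total / (row_sums[r] * col_sums[c]))
    return mi
```

A sequence with no positional preference gives a value near zero, while a strongly periodic one gives a larger value; comparing the matrices of two sequences is one way to compare their TP types.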


To determine the periodicity of a DNA sequence, different spectral approaches are applied: discrete Fourier transform (DFT), autocorrelation (CORR), information decomposition (ID), hybrid method (HYB), concept of spectral envelope for spectral analysis (SE), normalized autocorrelation (CORR_N), and profile analysis (PA). In this work, we investigated whether each of these methods can find the true period length, depending on the average number of accumulated changes in DNA bases (PM). The results show that for periods with short length (≤4 b.


Triplet periodicity (TP) is a distinctive property of protein-coding sequences. Some complex genes have more than one TP type along their sequence; we say that such genes contain a triplet periodicity change point.
