In this paper, by extending some results of informational genomics, we present a new randomness test based on the empirical entropy of strings and on some properties of the repeatability and unrepeatability of substrings of certain lengths. We give the theoretical motivations of our method and some experimental results of its application to a wide class of strings: decimal representations of real numbers, roulette outcomes, logistic maps, linear congruential generators, quantum measurements, natural language texts, and genomes. It will be evident that the evaluation of randomness resulting from our tests does not distinguish among the different sources of randomness (natural or pseudo-random).
BMC Bioinformatics
November 2018
Background: Pan-genome approaches afford the discovery of homology relations in a set of genomes by determining how gene families are distributed among them. Retrieving the complete gene distribution across a class of genomes is an NP-hard problem, and its computational cost grows with the number of analyzed genomes because all-against-all gene comparisons are required to solve it completely. In the presence of phylogenetically distant genomes, the variability introduced by gene duplication and transmission makes the task of recognizing homologous genes even more difficult.
In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes.
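As a rough illustration of the quantities mentioned in this abstract, the following Python sketch computes the empirical Shannon entropy of the k-mer distribution of a sequence, with k chosen from the sequence length. It is not the authors' implementation. It assumes that lg2(n) denotes the base-2 logarithm and that k is obtained by rounding to the nearest integer; the function names `log2_k` and `kmer_entropy` and the toy sequence are hypothetical.

```python
import math
from collections import Counter


def log2_k(n):
    """Word length k = log2(n), rounded to the nearest integer (rounding is an assumption)."""
    return max(1, round(math.log2(n)))


def kmer_entropy(seq, k):
    """Empirical Shannon entropy, in bits, of the overlapping k-mers of seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        # The sequence is shorter than k, so there are no k-mers to count.
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    toy = "ACGTTGCAACGTAGCTAGCTAGGATCCA"  # toy sequence, not a real genome
    k = log2_k(len(toy))
    print(k, kmer_entropy(toy, k))
```

The entropy is bounded above by log2 of the number of k-mer positions, n - k + 1, so with k close to log2(n) it cannot exceed roughly k bits.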
The MpTheory Java library is an open-source project collecting a set of objects and algorithms for modeling observed dynamics by means of Metabolic P (MP) theory, a mathematical theory introduced in 2004 for modeling biological dynamics. With the library, biological systems can be modeled in both continuous and discrete time. Moreover, the library comprises a set of regression algorithms for inferring MP models starting from time series of observations.
MP-GeneticSynth is a Java tool for discovering the logic and regulation mechanisms responsible for observed biological dynamics in terms of finite difference recurrent equations. The software makes use of: (i) metabolic P systems as a modeling framework, (ii) an evolutionary approach to discover flux regulation functions as linear combinations of given primitive functions, (iii) a suitable reformulation of the least squares method to estimate function parameters considering simultaneously all the reactions involved in complex dynamics. The tool is available as a plugin for the virtual laboratory MetaPlab.
In this paper we present a new methodology, based on genetic algorithms and multiple linear regression, for discovering regulation mechanisms responsible for observed time series in biological networks. The modeling framework employed is called Metabolic P systems; they are deterministic and time-discrete dynamical systems proposed as an effective alternative to ordinary differential equations for modeling biochemical systems. Our methodology is here successfully applied to the mitotic oscillator in early amphibian embryos.
Background: In the post-genomic era, several methods of computational genomics are emerging to understand how information as a whole is structured within genomes. The literature of the last five years reports several alignment-free methods, which have arisen as alternative metrics for the dissimilarity of biological sequences. Among others, recent approaches are based on empirical frequencies of DNA k-mers in whole genomes.
A mathematical notation is introduced to represent, at a symbolic level, different mechanisms of DNA recombination, and a 'PCR lemma' is proven by analytically describing the combinatorial properties of the polymerase chain reaction process. This approach led to the discovery of novel techniques, based on a form of PCR which we called cross pairing PCR (briefly XPCR). They were mathematically analyzed and already experimentally proven in different contexts, such as DNA extraction and recombination.
The metabolic P algorithm is a procedure which determines, in a biochemically realistic way, the evolution of P systems representing biological phenomena. A new formulation of this algorithm is given and a graphical formalism is introduced which seems to be very natural in expressing biological networks by means of a two level representation: a basic biochemical level and a second one which regulates the dynamical interaction among the reactions of the first level. After some basic examples, the mitotic oscillator in amphibian embryos is considered as an important case study.
A dynamical analysis of P systems is given that is focused on basic phenomena of biological relevance. After a short presentation of a new kind of P systems (PB systems), membrane systems with environment, called PBE systems, are introduced that are more suitable for modeling complex membrane interactions. Some types of periodicity and non-periodicity are considered for PBE systems by showing some "minimal" examples of systems that exhibit these properties.