The coupled Wright-Fisher diffusion is a multi-dimensional Wright-Fisher diffusion for multi-locus and multi-allelic genetic frequencies, expressed as the strong solution to a system of stochastic differential equations that are coupled in the drift, where the pairwise interaction among loci is modelled by an inter-locus selection. In this paper, an ancestral process, which is dual to the coupled Wright-Fisher diffusion, is derived. The dual process corresponds to the block counting process of coupled ancestral selection graphs, one for each locus.
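As a rough illustration of what "coupled in the drift" can mean, the sketch below simulates two loci with two alleles each by an Euler-Maruyama scheme. It is not the model from the paper: the drift form `x(1 - x)(s_i + h * x_j)`, the parameter names `s` and `h`, the step size, and the clipping to [0, 1] are all assumptions chosen for illustration.

```python
import math
import random


def simulate_coupled_wf(x0, s, h, dt=1e-3, steps=1000, seed=0):
    """Euler-Maruyama sketch of two Wright-Fisher diffusions coupled in the drift.

    x0 : starting frequencies of the tracked allele at locus 0 and locus 1
    s  : per-locus selection coefficients (hypothetical parameters)
    h  : inter-locus interaction strength (hypothetical parameter)
    """
    rng = random.Random(seed)
    x = list(x0)
    path = [tuple(x)]
    for _ in range(steps):
        new_x = []
        for i in (0, 1):
            j = 1 - i
            var = x[i] * (1.0 - x[i])  # Wright-Fisher diffusion coefficient
            # Assumed drift: the frequency at the other locus shifts selection here.
            drift = var * (s[i] + h * x[j])
            noise = math.sqrt(var * dt) * rng.gauss(0.0, 1.0)
            xi = x[i] + drift * dt + noise
            # Clip to [0, 1]; discretization can overshoot the boundaries.
            new_x.append(min(1.0, max(0.0, xi)))
        x = new_x
        path.append(tuple(x))
    return path


if __name__ == "__main__":
    path = simulate_coupled_wf((0.5, 0.5), s=(1.0, -1.0), h=2.0)
    print(path[-1])
```

The two loci are driven by independent noise terms and interact only through the drift, which is the structural feature the abstract describes. Clipping is a crude boundary treatment; a careful study would use a scheme designed for the degenerate diffusion coefficient at 0 and 1.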
In an attempt to map chromosomal regions carrying rare gene variants contributing to the risk of multiple sclerosis (MS), we identified segments shared identical-by-descent (IBD) using the software BEAGLE 4.0's refined IBD analysis. IBD mapping aims at identifying segments inherited from a common ancestor and shared more frequently in case-case pairs.
We introduce a Bayesian theoretical formulation of the statistical learning problem concerning the genetic structure of populations. The two key concepts in our derivation are exchangeability in its various forms and random allocation models. Implications of our results for the empirical investigation of population structure are discussed.
A molecular interaction library modeling favorable non-bonded interactions between atoms and molecular fragments is considered. In this paper, we represent the structure of the interaction library by a network diagram, which demonstrates that the underlying prediction model obtained for a molecular fragment is multi-layered. We clustered the molecular fragments into four groups by analyzing the pairwise distances between the molecular fragments.
Minimization of stochastic complexity (SC) was used as a method for classification of genotypic fingerprints. The method was applied to fluorescent amplified fragment length polymorphism (fAFLP) fingerprint patterns of 507 Vibrionaceae representatives. As the current BinClass implementation of the optimization algorithm for classification only works on binary vectors, the original fingerprints were discretized in a preliminary step using the sliding-window band-matching method, in order to maximally preserve the information content of the original band patterns.
Microbiologists have traditionally applied hierarchical clustering algorithms as their mathematical tool of choice to unravel the taxonomic relationships between micro-organisms. However, the interpretation of such hierarchical classifications suffers from being subjective, in that a variety of ad hoc choices must be made during their construction. On the other hand, the application of more profound and objective mathematical methods--such as the minimization of stochastic complexity--for the classification of bacterial genotyping fingerprints data is hampered by the prerequisite that such methods only act upon vectorized data.
We describe a library of molecular fragments designed to model and predict non-bonded interactions between atoms. We apply the Bayesian approach, whereby prior knowledge and uncertainty of the mathematical model are incorporated into the estimated model and its parameters. The molecular interaction data are strengthened by narrowing the atom classification to 14 atom types, focusing on independent molecular contacts that lie within a short cutoff distance, and symmetrizing the interaction data for the molecular fragments.
We apply minimization of stochastic complexity and the closely related method of cumulative classification to analyse the extensively studied BIOLOG GN data of Vibrio spp. Minimization of stochastic complexity provides an objective tool of bacterial taxonomy as it produces classifications that are optimal from the point of view of information theory. We compare the outcome of our results with previously published classifications of the same data set.
We present a theory of classification and predictive identification of bacteria. Bacterial strains are characterized by a binary vector and the taxonomy is specified by attaching a label to each vector. The theory is developed from only two basic assumptions, viz.