The continuous-time Markov chain (CTMC) is the mathematical workhorse of evolutionary biology. Learning CTMC model parameters using modern, gradient-based methods requires the derivative of the matrix exponential evaluated at the CTMC's infinitesimal generator (rate) matrix. Motivated by the derivative's extreme computational complexity as a function of state space cardinality, recent work demonstrates the surprising effectiveness of a naive, first-order approximation for a host of problems in computational biology.
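As a rough illustration of the quantities this abstract refers to, the sketch below compares an exact directional derivative of exp(tQ) with a first-order approximation on a tiny generator matrix. It rests on assumptions not stated in the abstract: the approximation is taken to have the form t·exp(tQ)·E for a perturbation direction E, the exact derivative is obtained from the standard block-triangular identity for the matrix exponential, and all function names are hypothetical. It is not the authors' implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm(a, squarings=10, terms=12):
    """Matrix exponential via a truncated Taylor series with scaling and squaring."""
    n = len(a)
    scale = 2.0 ** squarings
    scaled = [[x / scale for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds scaled^k / k!
        term = [[x / k for x in row] for row in matmul(term, scaled)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def exact_derivative(q, e, t):
    """Derivative of exp(t(Q + eps*E)) at eps = 0, read from the upper-right
    block of exp([[tQ, tE], [0, tQ]])."""
    n = len(q)
    block = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            block[i][j] = t * q[i][j]
            block[i][j + n] = t * e[i][j]
            block[i + n][j + n] = t * q[i][j]
    big = expm(block)
    return [row[n:] for row in big[:n]]


def first_order_approximation(q, e, t):
    """Assumed form of the first-order approximation: t * exp(tQ) * E."""
    p = expm([[t * x for x in row] for row in q])
    return [[t * x for x in row] for row in matmul(p, e)]


def max_abs_difference(a, b):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Under these assumptions the two quantities coincide whenever E commutes with Q, and differ otherwise; calling `max_abs_difference(exact_derivative(Q, E, t), first_order_approximation(Q, E, t))` on a small generator gives a feel for the size of the gap. The block-matrix route needs the exponential of a matrix twice as large, which hints at why the exact derivative becomes expensive as the state space grows.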
Phylogenetic comparative methods use random processes, such as Brownian motion, to model the evolution of continuous traits on phylogenetic trees. Growing evidence for non-gradual evolution has motivated the development of more complex models, often based on Lévy processes. However, their statistical inference is computationally intensive and currently relies on approximations, high-dimensional sampling, or numerical integration.
Annu Int Conf IEEE Eng Med Biol Soc
July 2022
Because drowsiness is a major cause of vehicle accidents, its automated detection is critical. Scale-free temporal dynamics are known to be typical of physiological and body rhythms. The present work quantifies the benefits of applying a recent and original multivariate self-similarity analysis to several modalities of polysomnographic measurements (heart rate, blood pressure, electroencephalogram and respiration), from the MIT-BIH Polysomnographic Database, to better classify drowsiness-related sleep stages.
Post-translational modifications (PTMs) are ubiquitous and essential for protein function and signaling, motivating the need for sustainable benefit and open models of web databases. Highly conserved O-GlcNAcylation is a case example of one of the most recently discovered PTMs, investigated by a growing community. Historically, details about O-GlcNAcylated proteins and sites were dispersed across literature and in non-O-GlcNAc-focused, rapidly outdated or now defunct web databases.
As confounding factors, directional trends are likely to make two quantitative traits appear spuriously correlated. By determining the probability distributions of independent contrasts when traits evolve following Brownian motions with linear trends, we show that the standard independent contrasts cannot be used to test for correlation in this situation. We propose a multiple regression approach which corrects the bias caused by directional evolution.
A tracer particle is called anomalously diffusive if its mean squared displacement grows approximately as $\sigma^2 t^{\alpha}$ as a function of time $t$ for some constant $\sigma^2$, where the diffusion exponent satisfies $\alpha \neq 1$. In this article, we use recent results on the asymptotic distribution of the time-averaged mean squared displacement [20] to construct statistical tests for detecting physical heterogeneity in viscoelastic fluid samples starting from one or multiple observed anomalously diffusive paths. The methods are asymptotically valid for the range $0 < \alpha < 3/2$ and involve a mathematical characterization of time-averaged mean squared displacement bias and the effect of correlated disturbance errors.
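To make the central statistic concrete, here is a minimal sketch of the time-averaged mean squared displacement (TAMSD) of a one-dimensional path, together with a naive log-log regression estimate of the diffusion exponent. It assumes unit time steps and a single scalar coordinate; the function names are hypothetical, and the regression is only a descriptive estimate, not the statistical tests constructed in the article.

```python
import math


def tamsd(path, lag):
    """Time-averaged mean squared displacement of a 1-D path at an integer lag."""
    n = len(path)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(path)")
    total = sum((path[i + lag] - path[i]) ** 2 for i in range(n - lag))
    return total / (n - lag)


def estimate_exponent(path, lags):
    """Least-squares slope of log TAMSD against log lag (naive exponent estimate)."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(tamsd(path, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator
```

For a path recorded at equally spaced times, `estimate_exponent(path, [1, 2, 4, 8])` returns the fitted slope; values far from 1 suggest anomalous diffusion, although finite-sample bias and measurement noise of the kind discussed in the abstract affect such a raw estimate.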
The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM Challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. We implemented here several extensions of the MolTi software that detects communities by optimizing multiplex (and monoplex) network modularity.
Bull Math Biol
October 2017
The time-dependent-asymmetric-linear parsimony is an ancestral state reconstruction method which extends the standard linear parsimony (a.k.a.
Choosing an ancestral state reconstruction method among the alternatives available for quantitative characters may be puzzling. We present here a comparison of seven of them, namely the maximum likelihood, restricted maximum likelihood, generalized least squares under Brownian, Brownian-with-trend and Ornstein-Uhlenbeck models, phylogenetic independent contrasts and squared parsimony methods. A review of the relations between these methods shows that the maximum likelihood, the restricted maximum likelihood and the generalized least squares under Brownian model infer the same ancestral states and can only be distinguished by the distributions accounting for the reconstruction uncertainty which they provide.
Various biological networks can be constructed, each featuring gene/protein relationships of different meanings (e.g., protein interactions or gene co-expression).
Despite its intrinsic difficulty, ancestral character state reconstruction is an essential tool for testing evolutionary hypotheses. Two major classes of approaches to this question can be distinguished: parsimony- or likelihood-based approaches. We focus here on the second class of methods, more specifically on approaches based on continuous-time Markov modeling of character evolution.
Background: While multiple alignment is the first step of usual classification schemes for biological sequences, alignment-free methods are being increasingly used as alternatives when multiple alignments fail. Subword-based combinatorial methods are popular for their low algorithmic complexity (suffix trees ..
Background: As public microarray repositories are constantly growing, we are facing the challenge of designing strategies to provide productive access to the available data.
Methodology: We used a modified version of the Markov clustering algorithm to systematically extract clusters of co-regulated genes from hundreds of microarray datasets stored in the Gene Expression Omnibus database (n = 1,484). This approach led to the definition of 18,250 transcriptional signatures (TS) that were tested for functional enrichment using the DAVID knowledgebase.
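For readers unfamiliar with Markov clustering, the sketch below shows the standard algorithm (alternating expansion and inflation of a column-stochastic matrix) on a small unweighted graph. This is the textbook procedure, not the modified version used in the study; the function names, the inflation value of 2.0, and the pruning threshold are illustrative assumptions.

```python
def normalize_columns(m):
    """Rescale each column so that it sums to one (columns summing to zero stay zero)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def markov_clustering(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    """Standard Markov clustering on a square adjacency matrix (list of rows)."""
    n = len(adjacency)
    # Add self-loops, then convert to a column-stochastic transition matrix.
    m = normalize_columns([[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(iterations):
        # Expansion: one step of matrix squaring spreads flow along paths.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: entrywise power plus renormalization sharpens strong flows.
        inflated = normalize_columns([[x ** inflation for x in row]
                                      for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Each row that still carries flow defines a cluster: the columns it attracts.
    clusters = []
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(c) for c in clusters)
```

Applied to a gene-by-gene similarity graph, each returned list would correspond to one candidate group of co-regulated genes; a higher `inflation` value generally yields smaller, tighter clusters.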
Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores.
Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space.
Background: In general, the construction of trees is based on sequence alignments. This procedure, however, leads to loss of information when parts of sequence alignments (for instance ambiguous regions) are deleted before tree building. To overcome this difficulty, one of us previously introduced a new and rapid algorithm that calculates dissimilarity matrices between sequences without preliminary alignment.