Publications by authors named "Veronica Vinciotti"

H2A.Z is an H2A-type histone variant essential for many aspects of cell biology, ranging from gene expression to genome stability. From deuterostomes, H2A.

Graphical lasso is one of the most widely used estimators for inferring genetic networks. Despite its popularity, there are several fields of applied research where the limits of detection of modern measurement technologies make the use of this estimator theoretically unfounded, even when the assumption of a multivariate Gaussian distribution is satisfied. Typical examples are data generated by polymerase chain reaction and flow cytometry.

Regression for count data is widely performed by models such as Poisson, negative binomial (NB) and zero-inflated regression. A challenge often faced by practitioners is the selection of the right model to take into account dispersion, which typically occurs in count datasets. It is highly desirable to have a unified model that can automatically adapt to the underlying dispersion and that can be easily implemented in practice.
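As a rough illustration of the dispersion issue mentioned above, the sketch below computes the variance-to-mean ratio of a count sample. Under a Poisson model this ratio is close to 1; values well above 1 suggest overdispersion and values well below 1 suggest underdispersion. This is only a diagnostic, not the unified model the abstract refers to, and the function name and toy data are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance-to-mean ratio of a list of counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical toy samples, not taken from the paper.
under_sample = [2, 3, 1, 4, 2, 3, 2, 3]   # tightly clustered counts
over_sample = [0, 0, 12, 1, 0, 9, 0, 2]   # many zeros plus a few large counts

ratio_under = dispersion_index(under_sample)
ratio_over = dispersion_index(over_sample)

print(f"tight sample:  variance/mean = {ratio_under:.2f}")
print(f"bursty sample: variance/mean = {ratio_over:.2f}")
```

A ratio far from 1 in either direction is the situation in which a plain Poisson regression is usually a poor fit.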

Chromatin ImmunoPrecipitation-sequencing (ChIP-seq) experiments have now become routine in biology for the detection of protein binding sites. In this chapter, we show how hidden Markov models can be used for the analysis of data generated by ChIP-seq experiments. We show how a hidden Markov model can naturally account for spatial dependencies in the ChIP-seq data, how it can be used in the presence of data from multiple ChIP-seq experiments under the same biological condition, and how it naturally accounts for the different IP efficiencies of individual ChIP-seq experiments.
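To make the idea of spatial dependence concrete, here is a minimal sketch of Viterbi decoding for a two-state hidden Markov model over binned read counts. It assumes Poisson emissions with fixed rates for "background" (state 0) and "enriched" (state 1) bins; the rates, transition probabilities and counts are hypothetical, and the chapter's actual model (multiple experiments, experiment-specific IP efficiencies) is not reproduced here.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson(lam) distribution."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi(counts, rates, log_init, log_trans):
    """Return the most probable state sequence for the observed counts."""
    n_states = len(rates)
    # Scores for the first bin: initial probability plus emission.
    score = [log_init[s] + poisson_logpmf(counts[0], rates[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at bin t + 1
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + poisson_logpmf(k, rates[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical settings: state 0 = background, state 1 = enriched.
rates = [1.0, 10.0]
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
counts = [0, 1, 0, 9, 12, 10, 1, 0]  # toy bin counts

path = viterbi(counts, rates, log_init, log_trans)
print(path)
```

The "sticky" transition matrix is what encodes the spatial dependency: neighbouring bins tend to share a state, so an isolated noisy count is less likely to be called enriched.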

Repo-Man is a protein phosphatase 1 (PP1) targeting subunit that regulates mitotic progression and chromatin remodelling. After mitosis, Repo-Man/PP1 remains associated with chromatin but its function in interphase is not known. Here we show that Repo-Man, via Nup153, is enriched on condensed chromatin at the nuclear periphery and at the edge of the nucleopore basket.

Background: Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions.

Results: We propose NEAT, a test for network enrichment analysis.

Background: Sparse Gaussian graphical models are popular for inferring biological networks, such as gene regulatory networks. In this paper, we investigate the consistency of these models across different data platforms, such as microarray and next generation sequencing, on the basis of a rich dataset containing samples that are profiled under both techniques as well as a large set of independent samples.

Results: Our analysis shows that individual node variances can have a remarkable effect on the connectivity of the resulting network.

Factorial Gaussian graphical Models (fGGMs) have recently been proposed for inferring dynamic gene regulatory networks from genomic high-throughput data. In the search for true regulatory relationships amongst the vast space of possible networks, these models allow the imposition of certain restrictions on the dynamic nature of these relationships, such as Markov dependencies of low order - some entries of the precision matrix are a priori zeros - or equal dependency strengths across time lags - some entries of the precision matrix are assumed to be equal. The precision matrix is then estimated by l1-penalized maximum likelihood, imposing a further constraint on the absolute value of its entries, which results in sparse networks.

Chromatin ImmunoPrecipitation-sequencing (ChIP-seq) experiments have now become routine in biology for the detection of protein-binding sites. In this paper, we present a Markov random field model for the joint analysis of multiple ChIP-seq experiments. The proposed model naturally accounts for spatial dependencies in the data, by assuming first-order Markov dependence and, for the large proportion of zero counts, by using zero-inflated mixture distributions.

Background: ImmunoPrecipitation (IP) efficiencies may vary widely between different antibodies and between repeated experiments with the same antibody. These differences have a large impact on the quality of ChIP-seq data: a more efficient experiment will necessarily lead to a higher signal-to-background ratio, and therefore to an apparently larger number of enriched regions, compared to a less efficient experiment. In this paper, we show how IP efficiencies can be explicitly accounted for in the joint statistical modelling of ChIP-seq data.

Gene regulatory networks give important insights into the mechanisms underlying physiology and pathophysiology. The derivation of gene regulatory networks from high-throughput expression data via machine learning strategies is problematic as the reliability of these models is often compromised by limited and highly variable samples, heterogeneity in transcript isoforms, noise, and other artifacts. Here, we develop a novel algorithm, dubbed Dandelion, in which we construct and train intraspecies Bayesian networks that are translated and assessed on independent test sets from other species in a reiterative procedure.

Background: The feminization of nature by endocrine-disrupting chemicals (EDCs) is a key environmental issue affecting both terrestrial and aquatic wildlife. A crucial and as yet unanswered question is whether EDCs have adverse impacts on the sustainability of wildlife populations. There is widespread concern that intersex fish are reproductively compromised, with potential population-level consequences.

In this paper, we explore the use of M-quantile regression and M-quantile coefficients to detect statistical differences between temporal curves that belong to different experimental conditions. In particular, we consider the application to temporal gene expression data. Here, the aim is to detect genes whose temporal expression is significantly different across a number of biological conditions.

In this paper, the extended Kalman filter (EKF) algorithm is applied to model the gene regulatory network from gene time series data. The gene regulatory network is considered as a nonlinear dynamic stochastic model that consists of the gene measurement equation and the gene regulation equation. After specifying the model structure, we apply the EKF algorithm for identifying both the model parameters and the actual value of gene expression levels.
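The predict/update cycle at the heart of an extended Kalman filter can be sketched for a single scalar state as follows. The regulation function `a * tanh(x)`, the identity measurement model and all numerical values are assumptions chosen for illustration; they are not the paper's model, and the sketch does not cover the joint estimation of model parameters that the abstract describes.

```python
import math


def ekf_step(x, p, z, a, q, r):
    """One predict/update cycle of a scalar extended Kalman filter.

    Hypothetical state model:       x_next = a * tanh(x) + noise (variance q)
    Hypothetical measurement model: z = x + noise (variance r)
    """
    # Predict: propagate the state through the nonlinear model and
    # linearise it around the current estimate to propagate the variance.
    x_pred = a * math.tanh(x)
    f_jac = a * (1.0 - math.tanh(x) ** 2)  # derivative of a * tanh(x)
    p_pred = f_jac * p * f_jac + q

    # Update: the measurement Jacobian is 1 for the identity measurement.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Toy values, not taken from the paper.
x_new, p_new = ekf_step(x=0.5, p=1.0, z=0.8, a=0.9, q=0.01, r=0.1)
print(x_new, p_new)
```

The updated estimate lies between the model prediction and the measurement, weighted by their relative uncertainties, and the posterior variance is smaller than the measurement noise variance.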

MicroRNAs (miRNAs) have recently emerged as a new complex layer of gene regulation. MiRNAs act post-transcriptionally, influencing the stability, compartmentalization, and translation of their target mRNAs. Computational efforts to understand the post-transcriptional gene regulation by miRNAs have been focused on the target prediction tools, while quantitative kinetic models of gene regulation by miRNAs have so far largely been overlooked.

Purpose: To examine the relationship between an anatomic map relating the retinal nerve fiber layer (RNFL) distribution to the optic nerve head and a functional map derived from the interpoint correlation of raw sensitivities in visual field (VF) testing.

Methods: Previously, interpoint correlations were generated for all possible pairs of VF test points in a dataset of 98,821 Humphrey VF test results taken from the Moorfields Eye Hospital archive. The relationship between these correlations and the physical distance between the VF test point pairs was evaluated by Pearson's correlation coefficient and multiple regression analysis.
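The kind of evaluation described in the Methods can be illustrated with a small Pearson correlation sketch relating pairwise distance to interpoint correlation. The numbers below are made up for illustration and are unrelated to the Moorfields dataset; the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: distance between test-point pairs and the
# correlation of their sensitivities.
distances = [1.0, 2.0, 3.0, 4.0, 5.0]
interpoint_corr = [0.9, 0.7, 0.6, 0.4, 0.2]

r = pearson_r(distances, interpoint_corr)
print(f"r = {r:.3f}")
```

In this toy example the coefficient is strongly negative, meaning that correlation falls as distance grows.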

Background: The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition.

Identifying genes that direct the mechanism of a disease from expression data is extremely useful in understanding how that mechanism works. This in turn may lead to better diagnoses and potentially could lead to a cure for that disease. This task becomes extremely challenging when the data are characterised by only a small number of samples and a high number of dimensions, as is often the case with gene expression data.

Objective: Progressive loss of the field of vision is characteristic of a number of eye diseases such as glaucoma, which is a leading cause of irreversible blindness in the world. Recently, there has been an explosion in the amount of data being stored on patients who suffer from visual deterioration, including field test data, retinal image data and patient demographic data. However, there has been relatively little work in modelling the spatial and temporal relationships common to such data.

Microarray analysis using clustering algorithms can suffer from lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage.
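One common way to combine several clusterings is an agreement (co-association) matrix that records, for each pair of genes, the fraction of methods placing them in the same cluster. The sketch below builds such a matrix from hypothetical label vectors; it illustrates the general idea of seeking consensus across methods and is not necessarily the algorithm introduced in the paper.

```python
from itertools import combinations


def agreement_matrix(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster."""
    n_items = len(labelings[0])
    if any(len(labels) != n_items for labels in labelings):
        raise ValueError("all labelings must cover the same items")
    agree = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        agree[i][i] = 1.0  # an item always agrees with itself
    for i, j in combinations(range(n_items), 2):
        together = sum(1 for labels in labelings if labels[i] == labels[j])
        agree[i][j] = agree[j][i] = together / len(labelings)
    return agree


# Hypothetical cluster labels for five genes from three clustering methods.
# Label values are arbitrary within each method; only co-membership matters.
method_a = [0, 0, 1, 1, 1]
method_b = [1, 1, 0, 0, 1]
method_c = [0, 0, 0, 1, 1]

agree = agreement_matrix([method_a, method_b, method_c])

# Pairs on which every method agrees form the most robust groupings.
unanimous = [(i, j) for i, j in combinations(range(5), 2)
             if agree[i][j] == 1.0]
print(unanimous)
```

Pairs with full agreement can be treated as robust groupings, while pairs with intermediate values flag genes whose assignment depends on the method used.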
