Cardiometabolic health is complex and characterized by an ensemble of correlated and/or co-occurring conditions including obesity, dyslipidemia, hypertension, and diabetes mellitus. It is affected by social, lifestyle, and environmental factors, which in turn exhibit complex correlation patterns. To account for the complexity of (i) exposure profiles and (ii) health outcomes, we propose to use a multitrait Bayesian variable selection approach to identify a sparse set of exposures jointly explanatory of the complex cardiometabolic health status.
J R Stat Soc Ser C Appl Stat
November 2023
Stability selection represents an attractive approach to identify sparse sets of features jointly associated with an outcome in high-dimensional contexts. We introduce an automated calibration procedure via maximisation of an in-house stability score and accommodating a priori-known block structure (e.g.
Plant defense responses involve several biological processes that allow plants to fight against pathogenic attacks. How these different processes are orchestrated within organs and depend on specific cell types is poorly known. Here, using single-cell RNA sequencing (scRNA-seq) technology on three independent biological replicates, we identified several cell populations representing the core transcriptional responses of wild-type Arabidopsis leaves inoculated with the bacterial pathogen Pseudomonas syringae DC3000.
The relative importance of ecological factors and species interactions for shaping species distributions is still debated. The realised niches of eight sympatric tephritid fruit flies were inferred from field abundance data using joint species distribution modelling and network inference, on the whole community and separately on three host plant groups. These estimates were then compared with the fundamental niches of seven fly species estimated through laboratory-measured fitnesses on host plants.
J Bioinform Comput Biol
February 2021
In many cancers, mechanisms of gene regulation can be severely altered. Identification of deregulated genes, which do not follow the regulation processes that exist between transcription factors and their target genes, is of importance to better understand the development of the disease. We propose a methodology to detect deregulation mechanisms with a particular focus on cancer subtypes.
Motivation: Association studies have been widely used to search for associations between common genetic variants and a given phenotype. However, it is now generally accepted that genes and environment must be examined jointly when estimating phenotypic variance. In this work we consider two types of biological markers: genotypic markers, which characterize an observation in terms of inherited genetic information, and metagenomic markers, which are related to the environment.
Background: In unsupervised learning and clustering, data integration from different sources and types is a difficult question discussed in several research areas. For instance, in omics analysis, dozens of clustering methods have been developed in the past decade. When a single source of data is at play, hierarchical clustering (HC) is extremely popular, as a tree structure is highly interpretable and arguably more informative than just a partition of the data.
Cell-cell communication involves a large number of molecular signals that function as words of a complex language whose grammar remains mostly unknown. Here, we describe an integrative approach involving (1) protein-level measurement of multiple communication signals coupled to output responses in receiving cells and (2) mathematical modeling to uncover input-output relationships and interactions between signals. Using human dendritic cell (DC)-T helper (Th) cell communication as a model, we measured 36 DC-derived signals and 17 Th cytokines broadly covering Th diversity in 428 observations.
This chapter addresses the problem of reconstructing regulatory networks in molecular biology by integrating multiple sources of data. We consider data sets measured from diverse technologies all related to the same set of variables and individuals. This situation is becoming more and more common in molecular biology, for instance, when both proteomic and transcriptomic data related to the same set of "genes" are available on a given cohort of patients.
Omic data are characterized by the presence of strong dependence structures that result either from data acquisition or from some underlying biological processes. Applying statistical procedures that do not adjust the variable selection step to the dependence pattern may result in a loss of power and the selection of spurious variables. The goal of this paper is to propose a variable selection procedure within the multivariate linear model framework that accounts for the dependence between the multiple responses.
In tumoral cells, gene regulation mechanisms are severely altered. Genes that do not react normally to their regulators' activity can provide explanations for the tumoral behavior, and be characteristic of cancer subtypes. We thus propose a statistical methodology to identify the misregulated genes given a reference network and gene expression data.
Oilseed rape (Brassica napus L.) was formed ~7500 years ago by hybridization between B. rapa and B.
The reprogramming of gene expression appears to be the major trend in synthetic and natural allopolyploids, where the expression of a substantial proportion of genes was shown to deviate from that of the parents or the average of the parents. In this study, we analyzed gene expression changes in previously reported, highly stable synthetic wheat allohexaploids that combine the D genome of Aegilops tauschii and the AB genome extracted from the natural hexaploid wheat Triticum aestivum. A comprehensive genome-wide analysis of transcriptional changes using the Affymetrix GeneChip Wheat Genome Array was conducted.
Stat Appl Genet Mol Biol
May 2010
We present a weighted-LASSO method to infer the parameters of a first-order vector auto-regressive model that describes time course expression data generated by directed gene-to-gene regulation networks. These networks are assumed to have prior internal structures of connectivity which drive the inference method. This prior structure can be either derived from prior biological knowledge or inferred by the method itself.
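As a rough illustration of the idea, the following Python sketch fits a first-order vector auto-regressive model x_t ≈ A x_{t-1} one row at a time, with an L1 penalty whose per-coefficient weights stand in for the prior structure. It is not the authors' implementation: the function names (`weighted_lasso`, `fit_var1`), the coordinate-descent solver, the penalty level `lam`, and the weight matrix are all assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalised updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|
    by cyclic coordinate descent (hypothetical helper)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.array(y, dtype=float)  # residual y - Xb, with b = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * b[j]          # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            b[j] = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            resid -= X[:, j] * b[j]          # add the updated coordinate back
    return b

def fit_var1(series, weights, lam):
    """series: (T, p) time course; weights: (p, p) penalty weights, where
    weights[i, j] penalises the edge j -> i. Returns the estimated A."""
    series = np.asarray(series, dtype=float)
    past, present = series[:-1], series[1:]
    p = series.shape[1]
    rows = [weighted_lasso(past, present[:, i], weights[i], lam) for i in range(p)]
    return np.vstack(rows)

# Usage on simulated data (assumed toy network with three genes).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.0, 0.0],
                   [0.0, 0.5, 0.0],
                   [0.4, 0.0, 0.3]])
x = np.zeros((300, 3))
for t in range(1, 300):
    x[t] = A_true @ x[t - 1] + rng.normal(size=3)

# Smaller weights on edges believed a priori to exist, larger elsewhere.
prior_weights = np.where(A_true != 0, 0.5, 2.0)
A_hat = fit_var1(x, prior_weights, lam=0.05)
print(np.round(A_hat, 2))
```

Lowering a weight makes the corresponding edge cheaper to include, which is one simple way a prior connectivity structure can steer the estimate.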
Summary: The R package SIMoNe (Statistical Inference for MOdular NEtworks) enables inference of gene-regulatory networks based on partial correlation coefficients from microarray experiments. Modelling gene expression data with a Gaussian graphical model (hereafter GGM), the algorithm estimates non-zero entries of the concentration matrix, in a sparse and possibly high-dimensional setting. Its originality lies in the fact that it searches for a latent modular structure to drive the inference procedure through adaptive penalization of the concentration matrix.
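The link between the concentration matrix and partial correlations in a GGM can be made concrete with a small Python sketch. It uses the standard identity rho_ij = -K_ij / sqrt(K_ii K_jj) and is not the SIMoNe interface: the function name `partial_correlations` and the example matrix are assumptions chosen for illustration.

```python
import numpy as np

def partial_correlations(K):
    """Convert a concentration (inverse covariance) matrix K into the matrix
    of partial correlations: rho_ij = -K_ij / sqrt(K_ii * K_jj)."""
    K = np.asarray(K, dtype=float)
    d = np.sqrt(np.diag(K))
    P = -K / np.outer(d, d)
    np.fill_diagonal(P, 1.0)  # by convention, each variable with itself
    return P

# Assumed toy concentration matrix for a chain graph 0 - 1 - 2.
K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
print(partial_correlations(K))
```

A zero entry of K gives a zero partial correlation, so the sparsity pattern of the estimated concentration matrix is exactly the edge set of the inferred network.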