Publications by authors named "Louis Wehenkel"

Motivation: Collaborative analysis of massive imaging datasets is essential to enable scientific discoveries.

Results: We developed Cytomine to foster active and distributed collaboration of multidisciplinary teams for large-scale image-based studies. It uses web development methodologies and machine learning in order to readily organize, explore, share and analyze (semantically and quantitatively) multi-gigapixel imaging data over the internet.

Background: The MaxT algorithm is a significance test that controls the family-wise error rate (FWER) during simultaneous hypothesis testing. However, the computing time and memory requirements of this procedure are proportional to the number of investigated hypotheses. The memory issue was solved in 2013 by Van Lishout's implementation of MaxT, which makes memory usage independent of the size of the dataset.
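
To make the idea of FWER control by permutation concrete, here is a minimal sketch of a single-step maxT procedure in Python. It is an illustration under stated assumptions, not the implementation discussed in the abstract: the test statistic (absolute difference of group means), the function names, and the `n_perm` and `seed` parameters are all hypothetical choices made for this example.

```python
import random
from statistics import mean


def abs_mean_diff(values, labels):
    """Illustrative test statistic: |mean(group 1) - mean(group 0)|."""
    group1 = [v for v, g in zip(values, labels) if g == 1]
    group0 = [v for v, g in zip(values, labels) if g == 0]
    return abs(mean(group1) - mean(group0))


def maxt_adjusted_pvalues(features, labels, n_perm=1000, seed=0):
    """Single-step maxT adjusted p-values (a sketch, not a reference implementation).

    features: one list of measurements per hypothesis, aligned with labels.
    labels:   0/1 group membership of each sample.
    """
    rng = random.Random(seed)
    observed = [abs_mean_diff(f, labels) for f in features]
    exceed = [0] * len(features)
    permuted = list(labels)
    for _ in range(n_perm):
        rng.shuffle(permuted)
        # Only the maximum statistic over all hypotheses is kept per permutation.
        max_stat = max(abs_mean_diff(f, permuted) for f in features)
        for j, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[j] += 1
    # Add-one correction so that no adjusted p-value is exactly zero.
    return [(count + 1) / (n_perm + 1) for count in exceed]


# Toy data: the first hypothesis separates the groups, the second does not.
labels = [0] * 10 + [1] * 10
strong = list(range(10)) + list(range(100, 110))
noise = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]
pvals = maxt_adjusted_pvalues([strong, noise], labels, n_perm=200)
```

Comparing every observed statistic against the permutation distribution of the maximum is what yields family-wise control; the cost of the loop grows with the number of hypotheses, which is consistent with the scaling concern raised in the abstract.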

Teleost fish such as zebrafish (Danio rerio) are increasingly used for physiological, genetic and developmental studies. Our understanding of the physiological consequences of altered gravity in an entire organism is still incomplete. We used altered gravity and drug treatment experiments to evaluate their effects specifically on bone formation and more generally on whole genome gene expression.

Networks are ubiquitous in biology, and computational approaches to their inference have been extensively investigated. In particular, supervised machine learning methods can be used to complete a partially known network by integrating various measurements. Two main supervised frameworks have been proposed: the local approach, which trains a separate model for each network node, and the global approach, which trains a single model over pairs of nodes.
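
The difference between the two frameworks is easiest to see in how the training data are laid out. The sketch below builds both layouts in Python; the function names, the toy feature vectors, and the choice to treat every pair absent from `known_edges` as a negative example are simplifying assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def local_training_sets(features, known_edges):
    """Local approach: one training set per node.

    For node n, each example is (features of another node, 1 if the edge
    between n and that node is known, else 0).
    """
    sets = {}
    for node in features:
        sets[node] = [
            (features[other], int(frozenset((node, other)) in known_edges))
            for other in sorted(features)
            if other != node
        ]
    return sets


def global_training_set(features, known_edges):
    """Global approach: a single training set whose examples are node pairs.

    Each example concatenates the two nodes' feature vectors.
    """
    return [
        (features[a] + features[b], int(frozenset((a, b)) in known_edges))
        for a, b in combinations(sorted(features), 2)
    ]


# Hypothetical toy network: three nodes with two measurements each.
features = {"A": [0.1, 0.9], "B": [0.2, 0.8], "C": [0.7, 0.3]}
known_edges = {frozenset(("A", "B"))}

local_sets = local_training_sets(features, known_edges)
global_set = global_training_set(features, known_edges)
```

In the local layout a model for node A only ever sees single-node inputs, whereas the global layout needs a pair representation (here a plain concatenation, which is one of several possible choices).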

Fine operating rules for security control and an automatic system for their online discovery were developed to adapt to the development of smart grids. The automatic system uses the real-time system state to determine critical flowgates, and then a continuation power flow-based security analysis is used to compute the initial transfer capability of critical flowgates. Next, the system applies Monte Carlo simulation of expected short-term changes in operating conditions, feature selection, and linear least-squares fitting to obtain the fine operating rules.
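
The last step, sampling operating conditions and fitting a linear rule, can be sketched as follows. This is a toy illustration only: `toy_transfer_capability` is a hypothetical stand-in for the security analysis, the single load feature and all numeric values are invented, and the real system described in the abstract also performs feature selection, which is omitted here.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def toy_transfer_capability(load_mw):
    """Hypothetical stand-in for the security analysis (not a power-flow model)."""
    return 1200.0 - 0.4 * load_mw


# Monte Carlo sampling of short-term operating conditions around a base case.
rng = random.Random(0)
loads = [rng.gauss(800.0, 50.0) for _ in range(500)]
limits = [toy_transfer_capability(load) for load in loads]

# The fitted line plays the role of an operating rule: limit ~ slope * load + intercept.
slope, intercept = fit_line(loads, limits)
```

Because the stand-in function is exactly linear, the fit recovers its coefficients; with a real security analysis the fitted rule would only approximate the sampled limits.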

Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typical photographs. Here, we present a methodology to automatically classify brightfield images of wildtype zebrafish embryos according to their defects by using an image analysis approach based on supervised machine learning.

The primary goal of genome-wide association studies (GWAS) is to discover variants that could lead, in isolation or in combination, to a particular trait or disease. Standard approaches to GWAS, however, are usually based on univariate hypothesis tests and therefore can account neither for correlations due to linkage disequilibrium nor for combinations of several markers. To discover and leverage such potential multivariate interactions, we propose in this work an extension of the Random Forest algorithm tailored for structured GWAS data.

Disordered regions, i.e., regions of proteins that do not adopt a stable three-dimensional structure, have been shown to play various and critical roles in many biological processes.

In this paper, we consider the batch mode reinforcement learning setting, where the central problem is to learn from a sample of trajectories a policy that satisfies or optimizes a performance criterion. We focus on the continuous state space case for which usual resolution schemes rely on function approximators either to represent the underlying control problem or to represent its value function. As an alternative to the use of function approximators, we rely on the synthesis of "artificial trajectories" from the given sample of trajectories, and show that this idea opens new avenues for designing and analyzing algorithms for batch mode reinforcement learning.

Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems.

Disulfide bridges strongly constrain the native structure of many proteins, and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches to this prediction problem adopt the following pipeline: first, they enrich the primary sequence with structural annotations; second, they apply a binary classifier to each candidate pair of cysteines to predict disulfide bonding probabilities; and finally, they use a maximum weight graph matching algorithm to derive the predicted disulfide connectivity pattern of a protein. In this paper, we adopt this three-step pipeline and propose an extensive study of the relevance of various structural annotations and feature encodings.
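
The final step of this pipeline, turning pairwise bonding probabilities into a connectivity pattern, can be illustrated with a small Python sketch. It uses an exhaustive search over pairings, which is an assumption made for brevity and is only practical for a handful of cysteines; it is not the matching algorithm used in the paper. The function name, residue positions, and probabilities are hypothetical.

```python
def best_connectivity(cysteines, prob):
    """Return (total weight, bonded pairs) of a maximum weight matching.

    cysteines: residue positions of the candidate cysteines.
    prob:      maps frozenset({i, j}) to the predicted bonding probability.
    Exhaustive recursion: exponential time, for illustration only.
    """
    def solve(remaining):
        if len(remaining) < 2:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        # Option 1: leave the first cysteine unbonded.
        best_score, best_pairs = solve(rest)
        # Option 2: bond it with each other remaining cysteine in turn.
        for k, partner in enumerate(rest):
            score, pairs = solve(rest[:k] + rest[k + 1:])
            score += prob.get(frozenset((first, partner)), 0.0)
            if score > best_score:
                best_score, best_pairs = score, [(first, partner)] + pairs
        return best_score, best_pairs

    return solve(tuple(sorted(cysteines)))


# Hypothetical classifier output for four cysteines.
prob = {
    frozenset((5, 12)): 0.9,
    frozenset((30, 44)): 0.8,
    frozenset((5, 30)): 0.6,
    frozenset((12, 44)): 0.5,
    frozenset((5, 44)): 0.1,
    frozenset((12, 30)): 0.2,
}
score, pattern = best_connectivity([5, 12, 30, 44], prob)
```

The matching step enforces that each cysteine takes part in at most one bridge, a constraint that independent per-pair classification cannot guarantee on its own.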

Predicting a particular cognitive state from a specific pattern of fMRI voxel values is still a methodological challenge. Decoding brain activity is usually performed in highly controlled experimental paradigms characterized by a series of distinct states induced by a temporally constrained experimental design. In more realistic conditions, the number, sequence and duration of mental states are unpredictably generated by the individual, resulting in complex and imbalanced fMRI data sets.

Motivation: Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple and fast, and their output is easily interpretable by biologists, but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods potentially miss some informative biomarkers.

Background: A large variety of proteins involved in inflammation, coagulation, lipid-oxidation and lipid metabolism have been associated with high-density lipoprotein (HDL) and it is anticipated that changes in the HDL proteome have implications for the multiple functions of HDL. Here, SELDI-TOF mass spectrometry (MS) was used to study the dynamic changes of HDL protein composition in a human experimental low-dose endotoxemia model. Ten healthy men with low HDL cholesterol (0.

Objective: Knee osteoarthritis (OA) is a heterogeneous, complex joint pathology of unknown aetiology. Biomarkers have been widely used to investigate OA but currently available biomarkers lack specificity and sensitivity. Therefore, novel biomarkers are needed to better understand the pathophysiological processes of OA initiation and progression.

One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high-throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge.

At the intersection of artificial intelligence and statistics, supervised learning allows algorithms to automatically build predictive models from observations of a system alone. During the last twenty years, supervised learning has been a tool of choice for analyzing the ever-growing and increasingly complex data generated in molecular biology, with successful applications in genome annotation, function prediction, and biomarker discovery. Among supervised learning methods, decision tree-based methods stand out as nonparametric methods that have the unique feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent accuracy.

Asthma is a complex inflammatory disease of the airways. A network of reciprocal interactions between inflammatory cells, peptidic mediators, extracellular matrix components, and proteases is thought to be involved in the onset and maintenance of asthma-related airway inflammation and remodeling. To date, new protein mediators displaying significant activity in the pathophysiology of asthma are still to be unveiled.

This paper compares reinforcement learning (RL) with model predictive control (MPC) in a unified framework and reports experimental results of their application to the synthesis of a controller for a nonlinear and deterministic electrical power oscillations damping problem. Both families of methods are based on the formulation of the control problem as a discrete-time optimal control problem. The considered MPC approach exploits an analytical model of the system dynamics and cost function and computes open-loop policies by applying an interior-point solver to a minimization problem in which the system dynamics are represented by equality constraints.

This supplement contains extended versions of a selected subset of papers presented at the workshop MLSB 2007, Machine Learning in Systems Biology, Evry, France, from September 24 to 25, 2007.

Objectives: Infliximab is the first anti-TNFα agent approved by the Food and Drug Administration for use in inflammatory bowel disease treatment. Few clinical, biological and genetic factors tend to predict response in subcategories of Crohn's disease (CD) patients, and none broadly predicts response to infliximab.

Design And Methods: Twenty CD patients showing clinical response or non-response to infliximab were used for serum proteomic profiling by Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (SELDI-TOF-MS), each before and after treatment.

Background: SELDI-TOF mass spectrometry (MS) is a high-throughput proteomic approach with potential for identifying novel forms of serum biomarkers of arthritis.

Methods: We used SELDI-TOF MS to analyze serum samples from patients with various forms of inflammatory arthritis. Several protein profiles were collected on different Bio-Rad Laboratories ProteinChip arrays (CM10 and IMAC-Cu(2+)) and were evaluated statistically to select potential biomarkers.

Background: With improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that generate large numbers of images at different imaging modalities and scales. This stresses the need for computer vision methods that automate image classification tasks.

Results: We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes.

In the post-genomic era, the prediction of transcription factor regulons by position weight matrix-based programs is a powerful approach to decipher biological pathways and to model regulatory networks in bacteria. Once a regulon prediction is available, the main difficulty is to estimate its reliability before starting expensive experimental validations, that is, to identify true positive hits in a seemingly endless list of potential target genes of a regulatory protein. Here we introduce PREDetector (Prokaryotic Regulatory Elements Detector), a tool developed for predicting regulons of DNA-binding proteins in bacterial genomes. Besides the automatic prediction, scoring and positioning of potential binding sites and their respective target genes in annotated bacterial genomes, it also provides an easy way to estimate the thresholds at which reliable new target genes can be found.

Crohn's disease and ulcerative colitis, known as inflammatory bowel diseases (IBD), are chronic immuno-inflammatory pathologies of the gastrointestinal tract. These diseases are multifactorial, polygenic and of unknown etiology. Clinical presentation is non-specific and diagnosis is based on clinical, endoscopic, radiological and histological criteria.
