Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering.

Nucleic Acids Res

Max Planck Institute for Informatics, Campus E1 4, Saarland University, 66123 Saarbrücken, Germany; Institute for Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230 Odense M, Denmark.

Published: May 2014

The explosion of biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as 'simultaneous clustering' or 'co-clustering', has been successfully used to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic, 'Bi-Force'. It is based on the weighted bicluster editing model and performs biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing it with two existing algorithms in the BiCluE software package. We then followed the biclustering evaluation protocol of a recent review paper by Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expression data. Brief. Bioinform., 14:279-292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. Bi-Force thus outperformed the existing tools, at least when following the evaluation protocols of Eren et al. Bi-Force is implemented in Java and integrated into the open source software package BiCluE. The software and all datasets used are publicly available at http://biclue.mpi-inf.mpg.de.
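To make the weighted bicluster editing model mentioned in the abstract more concrete, the following is a minimal sketch of how the cost of one candidate solution could be scored. It is not the Bi-Force heuristic itself, and it is written in Python rather than the Java of the actual implementation. It assumes the common formulation in which a pair of entities is connected when its similarity exceeds a user-chosen threshold, and in which inserting or deleting an edge costs the absolute difference between the similarity and that threshold. The function name `editing_cost`, its parameters, and the toy similarity matrix are hypothetical and chosen only for illustration.

```python
def editing_cost(similarity, threshold, row_cluster, col_cluster):
    """Score a candidate biclustering under an assumed weighted
    bicluster editing model.

    similarity  -- 2D list, similarity[i][j] between row entity i
                   and column entity j (hypothetical input format)
    threshold   -- pairs with similarity > threshold count as edges
    row_cluster -- cluster label for each row entity
    col_cluster -- cluster label for each column entity
    """
    cost = 0.0
    for i, row in enumerate(similarity):
        for j, s in enumerate(row):
            has_edge = s > threshold
            same_cluster = row_cluster[i] == col_cluster[j]
            # An edge between clusters must be deleted; a missing edge
            # inside a cluster must be inserted. Either edit is assumed
            # to cost |s - threshold|.
            if has_edge != same_cluster:
                cost += abs(s - threshold)
    return cost


# Toy data (invented for illustration): 3 row entities x 3 column entities.
similarity = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
    [0.1, 0.3, 0.9],
]
row_cluster = [0, 0, 1]
col_cluster = [0, 0, 1]

cost = editing_cost(similarity, 0.5, row_cluster, col_cluster)
print(f"editing cost: {cost:.2f}")
```

Under these assumptions, a heuristic for bicluster editing searches for the cluster assignment that minimizes this cost; a lower value means fewer or cheaper edits are needed to turn the thresholded graph into disjoint bicliques.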

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5769343
DOI: http://dx.doi.org/10.1093/nar/gku201

Publication Analysis

Top Keywords

bicluster editing: 16
gene expression: 16
expression data: 8
biological data: 8
bi-force existing: 8
biclue software: 8
software package: 8
eren et al: 8
existing tools: 8
bi-force: 7

Similar Publications

We present a novel approach to identify human microRNA (miRNA) regulatory modules (mRNA targets and relevant cell conditions) by biclustering a large collection of mRNA fold-change data for sequence-specific targets. Bicluster targets were assessed using validated messenger RNA (mRNA) targets and exhibited on an average 17.0% (median 19.

Background: The explosion of biological data has dramatically reformed today's biology research. The biggest challenge for biologists and bioinformaticians is the integration and analysis of large quantities of data to provide meaningful insights. One major problem is the combined analysis of data of different types.

The explosion of biological data has largely influenced the focus of today’s biology research. Integrating and analysing large quantities of data to provide meaningful insights has become the main challenge for biologists and bioinformaticians. One major problem is the combined analysis of data of different types, such as phenotypes and genotypes.
