ParBiBit: Parallel tool for binary biclustering on modern distributed-memory systems.

PLoS One

Grupo de Arquitectura de Computadores, Universidade da Coruña, A Coruña, Spain.

Published: July 2018

Biclustering techniques are gaining attention in the analysis of large-scale datasets because they identify two-dimensional submatrices in which both rows and columns are correlated. In this work we present ParBiBit, a parallel tool that accelerates the search for interesting biclusters in binary datasets, which are widely used in fields such as genetics, marketing and text mining. It is based on the state-of-the-art sequential Java tool BiBit, which several studies have shown to be accurate, especially in scenarios that yield many large biclusters. ParBiBit uses the same methodology as BiBit (grouping the binary information into patterns) and provides the same results. Nevertheless, our tool significantly improves performance thanks to an efficient C++11 implementation that supports both threads and MPI processes in order to exploit the compute capabilities of modern distributed-memory systems, which consist of several multicore CPU nodes interconnected through a network. Our performance evaluation with 18 representative input datasets on two different eight-node systems shows that our tool is significantly faster than the original BiBit. The source code, written in C++ with MPI for Linux systems, and a reference manual are available at https://sourceforge.net/projects/parbibit/.
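
The pattern-based idea mentioned in the abstract can be illustrated with a short sketch. The code below is not taken from ParBiBit or BiBit; it is a simplified, sequential illustration that assumes each row fits in a single 64-bit word, that a pattern is the bitwise AND of two rows, and that a bicluster gathers every row containing that pattern. The names Bicluster, countBits, findBiclusters, minRows and minCols are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// One bicluster: a shared column pattern plus the rows that contain it.
struct Bicluster {
    std::uint64_t pattern;          // bit c set => column c is in the bicluster
    std::vector<std::size_t> rows;  // indices of the rows containing the pattern
};

// Portable bit count (C++11 has no std::popcount).
inline int countBits(std::uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }  // clear the lowest set bit each iteration
    return n;
}

// rows: binary matrix with each row packed into one 64-bit word (assumption:
// at most 64 columns). minRows / minCols: hypothetical size thresholds.
std::vector<Bicluster> findBiclusters(const std::vector<std::uint64_t>& rows,
                                      std::size_t minRows, int minCols) {
    std::vector<Bicluster> result;
    std::set<std::uint64_t> seen;  // patterns already processed
    for (std::size_t i = 0; i < rows.size(); ++i) {
        for (std::size_t j = i + 1; j < rows.size(); ++j) {
            // Candidate pattern: columns set in both rows of the pair.
            const std::uint64_t p = rows[i] & rows[j];
            if (countBits(p) < minCols) continue;   // too few columns
            if (!seen.insert(p).second) continue;   // pattern already handled
            Bicluster b{p, {}};
            // Collect every row that contains all columns of the pattern.
            for (std::size_t k = 0; k < rows.size(); ++k) {
                if ((rows[k] & p) == p) b.rows.push_back(k);
            }
            if (b.rows.size() >= minRows) result.push_back(std::move(b));
        }
    }
    return result;
}
```

Because each row pair is examined independently, the outer loop is a natural candidate for splitting among threads or MPI processes, although the shared set of seen patterns would then need synchronization or a final merge. How ParBiBit actually distributes this work is described in the full article and reference manual, not here.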

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5880350 (PMC)
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194361 (PLOS)
