Background: In several domains, such as DNA microarray data analysis, the rows (genes) and columns (conditions) of a data matrix must be clustered simultaneously to identify groups of rows that behave coherently across groups of columns. This kind of clustering is called biclustering. Biclustering algorithms are used extensively in DNA microarray data analysis, and more effective ones are needed.

Methods: We introduce BiMine, a new enumeration algorithm for biclustering of DNA microarray data. The proposed algorithm is based on three original features. First, BiMine relies on a new evaluation function called Average Spearman's rho (ASR). Second, BiMine uses a new tree structure, called the Bicluster Enumeration Tree (BET), to represent the different biclusters discovered during the enumeration process. Third, to avoid a combinatorial explosion of the search tree, BiMine introduces a parametric rule that lets the enumeration process cut tree branches that cannot lead to good biclusters.
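The abstract names the ASR evaluation function but does not give its formula. The sketch below is one plausible reading, not the authors' definition: it assumes ASR is the mean pairwise Spearman's rho among the rows of a candidate bicluster, or among its columns, whichever is larger. The function names (`ranks`, `spearman_rho`, `mean_pairwise_rho`, `average_spearman_rho`) and the handling of constant vectors are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            result[order[position]] = average_rank
        start = end + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant vector contributes no correlation
    return cov / sqrt(var_x * var_y)


def mean_pairwise_rho(vectors):
    """Average Spearman's rho over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0  # assumption: a single vector has no pairwise score
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


def average_spearman_rho(matrix, rows, cols):
    """Assumed ASR: the larger of the row-wise and column-wise mean rho."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_score = mean_pairwise_rho(sub)
    col_score = mean_pairwise_rho([list(column) for column in zip(*sub)])
    return max(row_score, col_score)


# Hypothetical expression matrix: rows are genes, columns are conditions.
expression = [
    [1, 2, 3],
    [2, 4, 6],
    [10, 20, 30],
    [3, 1, 2],
]
coherent = average_spearman_rho(expression, rows=[0, 1, 2], cols=[0, 1, 2])
mixed = average_spearman_rho(expression, rows=[0, 3], cols=[0, 1, 2])
```

Under this reading, a pruning rule could compare the score of a candidate bicluster against a user-chosen threshold and stop extending branches that fall below it. The abstract does not state the actual rule or its parameter, so any threshold value would be a hypothetical choice.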

Results: The performance of the proposed algorithm is assessed on both synthetic and real DNA microarray data. The experimental results show that BiMine competes well with several other biclustering methods. Moreover, we assess the biological significance of the discovered biclusters with a gene annotation web tool, showing that the proposed method produces biologically relevant biclusters. The software is available to academic users upon request from the authors.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2804695
DOI: http://dx.doi.org/10.1186/1756-0381-2-9


