Biclustering, the simultaneous clustering of rows and columns of a data matrix, has proved effective in bioinformatics because it produces local rather than global models. It has evolved from a key technique in gene expression data analysis into one of the most widely used approaches for pattern discovery and identification of biological modules, in both descriptive and predictive learning tasks. This survey presents a comprehensive overview of biclustering. It proposes an updated taxonomy for its fundamental components (bicluster, biclustering solution, biclustering algorithms, and evaluation measures) and applications. We unify scattered concepts in the literature with new definitions to accommodate the diversity of data types (such as tabular, network, and time series data) and the specificities of biological and biomedical data domains. We further propose a pipeline for biclustering data analysis and discuss practical aspects of incorporating biclustering in real-world applications. We highlight prominent application domains, particularly in bioinformatics, and identify typical biclusters to illustrate the analysis output. Moreover, we discuss important aspects to consider when choosing, applying, and evaluating a biclustering algorithm. We also relate biclustering to other data mining tasks (clustering, pattern mining, classification, triclustering, N-way clustering, and graph mining). The survey thus provides theoretical and practical guidance on biclustering data analysis, demonstrating its potential to uncover actionable insights from complex datasets.
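To make the notion of a local model concrete, the sketch below scores a candidate bicluster (a subset of rows and a subset of columns) with the mean squared residue of Cheng and Church, one commonly used coherence measure. The abstract does not name this measure; it is chosen here only as an illustration, and the function name `mean_squared_residue` and the toy matrix `data` are assumptions made for the example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix defined by `rows` and `cols`.

    A value of 0 means the submatrix follows a perfectly additive pattern
    (each entry = row effect + column effect). Illustrative helper only.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)

    # Row, column, and overall means of the selected submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy matrix (hypothetical data): rows 0, 1, 3 and columns 0, 1, 3 follow an
# additive pattern, while row 2 and column 2 do not.
data = [
    [1.0, 2.0, 9.0, 4.0],
    [2.0, 3.0, 1.0, 5.0],
    [7.0, 0.0, 3.0, 8.0],
    [4.0, 5.0, 6.0, 7.0],
]

local_score = mean_squared_residue(data, [0, 1, 3], [0, 1, 3])
global_score = mean_squared_residue(data, [0, 1, 2, 3], [0, 1, 2, 3])
print(local_score, global_score)
```

The selected submatrix scores close to zero while the full matrix does not, which is the sense in which a bicluster captures a pattern that holds only for some rows under some columns.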
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11247412 | PMC |
| http://dx.doi.org/10.1093/bib/bbae342 | DOI Listing |
World Neurosurg
January 2025
Department of Orthopaedic Surgery, the Bozhou Hospital Affiliated to Anhui Medical University, Bozhou, Anhui, China.
Background: Acute spinal cord injury causes severe motor and sensory dysfunction, significantly burdening individuals and society. This study uses bibliometric analysis to identify research trends and key areas, providing insights for future advancements in treatment.
Methods: Scientific publications on acute spinal cord injury were collected from PubMed and the Web of Science Core Collection (WoSCC) between 2000 and 2022.
Brief Bioinform
November 2024
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Circuito Escolar, Ciudad Universitaria, 04510 Mexico city, México.
Analyzing gene expression data helps identify significant biological relationships among genes. With a growing number of open biological datasets available, it is paramount to use reliable and innovative methods to perform in-depth analyses of biological data and ensure that informed decisions are made based on accurate information. Evolutionary algorithms have been successful in the analysis of biological datasets.
Clin Breast Cancer
November 2024
Massachusetts College of Pharmacy and Health Sciences, Worcester, Massachusetts.
Background: There are documented differences in breast cancer (BrCA) presentations and outcomes between Black and White patients. In addition to molecular factors, socioeconomic, racial, and clinical factors result in disparities in outcomes for women in the United States. Using machine learning and unsupervised biclustering methods within a multiomics framework, we sought to shed light on the biological and clinical underpinnings of observed differences between Black and White BrCA patients.
Stat Comput
December 2024
Department of Statistics, Penn State University, Joab L. Thomas Building, University Park, 16802 PA USA.
Motif discovery is gaining increasing attention in the domain of functional data analysis. Functional motifs are typical "shapes" or "patterns" that recur multiple times in different portions of a single curve and/or in misaligned portions of multiple curves. In this paper, we define functional motifs using an additive model and we propose for their discovery and evaluation.
bioRxiv
November 2024
Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA; Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
The growing availability of genome-wide association studies (GWAS) and large-scale biobanks provides an unprecedented opportunity to explore the genetic basis of complex traits and diseases. However, with this vast amount of data comes the challenge of interpreting numerous associations across thousands of traits, especially given the high polygenicity and pleiotropy underlying complex phenotypes. Traditional clustering methods, which identify global patterns in data, lack the resolution to capture overlapping associations relevant to subsets of traits or genes.