Fast tree aggregation for consensus hierarchical clustering.

BMC Bioinformatics

Université Paris-Saclay, CNRS, INRAE, Univ Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, 91405, France.

Published: March 2020

Background: In unsupervised learning and clustering, data integration from different sources and types is a difficult question discussed in several research areas. For instance, in omics analysis, dozens of clustering methods have been developed in the past decade. When a single source of data is at play, hierarchical clustering (HC) is extremely popular, as a tree structure is highly interpretable and arguably more informative than just a partition of the data. However, blindly applying HC to multiple sources of data raises computational and interpretation issues.

Results: We propose mergeTrees, a method that aggregates a set of trees with the same leaves to create a consensus tree. In our consensus tree, a cluster at height h contains the individuals that are in the same cluster for all the trees at height h. The method is exact and proven to be O(nq log(n)), n being the number of individuals and q being the number of trees to aggregate. Our implementation is extremely effective on simulations, allowing us to process many large trees at a time. We also rely on mergeTrees to perform the cluster analysis of two real omics data sets, introducing a spectral variant as an efficient and robust by-product.
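The consensus rule stated above (two individuals share a consensus cluster at height h only if they share a cluster in every tree at height h) can be illustrated with a small sketch. This is a naive, per-height illustration of the definition only, not the mergeTrees algorithm and not the package's API. The tree encoding (a list of `(height, leaf_a, leaf_b)` merges, where each leaf stands for the cluster it currently belongs to) and the function names `cut_tree` and `consensus_partition` are assumptions made for this example.

```python
def cut_tree(n, merges, h):
    """Cut one tree at height h and return a cluster label for each leaf 0..n-1.

    `merges` is an assumed encoding: (height, leaf_a, leaf_b) triples, where
    leaf_a and leaf_b are representatives of the two clusters being merged.
    """
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for height, a, b in merges:
        if height <= h:  # only merges at or below the cut height apply
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra
    return [find(i) for i in range(n)]


def consensus_partition(n, trees, h):
    """Group leaves that fall in the same cluster in every tree at height h."""
    labels = [cut_tree(n, merges, h) for merges in trees]
    groups = {}
    for leaf in range(n):
        # Two leaves get the same key only if all trees agree on them.
        key = tuple(tree_labels[leaf] for tree_labels in labels)
        groups.setdefault(key, []).append(leaf)
    return sorted(groups.values())


# Two toy trees on the same four leaves (hypothetical data).
tree1 = [(1.0, 0, 1), (2.0, 2, 3), (3.0, 0, 2)]
tree2 = [(1.0, 0, 1), (1.5, 1, 2), (4.0, 0, 3)]

for h in (1.0, 2.0, 3.0, 4.0):
    print(h, consensus_partition(4, [tree1, tree2], h))
```

At h = 2.0, the first tree groups {0, 1} and {2, 3} while the second groups {0, 1, 2} and {3}, so only leaves 0 and 1 remain together in the consensus. Recomputing the partition at every height, as done here, is far more costly than the complexity quoted for the actual method; the sketch is meant only to make the definition concrete.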

Conclusions: Our tree aggregation method can be used in conjunction with hierarchical clustering to perform efficient cluster analysis. This approach was found to be robust to the absence of clustering information in some of the data sets, as well as to increased variability within true clusters. The method is implemented in R/C++ and available as an R package named mergeTrees, which makes it easy to integrate into existing or new pipelines in several research areas.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7085155
DOI: http://dx.doi.org/10.1186/s12859-020-3453-6
