Background: Hierarchical clustering methods like Ward's method have been used for decades to understand biological and chemical data sets. To obtain a partition of the data set, an optimal level of the hierarchy must be chosen by a so-called level selection algorithm. In 2005, Palla et al. introduced a new kind of hierarchical clustering method that differs from Ward's method in two ways: it can be used on data for which no full similarity matrix is defined, and it can produce overlapping clusters, i.e., it allows items to belong to multiple clusters. These features are well suited to biological and chemical data sets, but until now no level selection algorithm has been published for this method.

Results: In this article we provide a general selection scheme, the level independent clustering selection method, called LInCS. With it, clusters can be selected from any level in quadratic time with respect to the number of clusters. Since hierarchically clustered data is not necessarily associated with a similarity measure, the selection is based on a graph theoretic notion of cohesive clusters. We present results of our method on two data sets: a set of drug-like molecules and a set of protein-protein interaction (PPI) data. In both cases the method provides a clustering with very good sensitivity and specificity values with respect to a given reference clustering. Moreover, we show for the PPI data set that our graph theoretic cohesiveness measure indeed chooses biologically homogeneous clusters and disregards inhomogeneous ones in most cases. Finally, we discuss how the method can be generalized to other hierarchical clustering methods to allow for a level independent cluster selection.
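The abstract does not state how sensitivity and specificity are computed against the reference clustering. One common convention, used here purely as an assumption, is pair counting: a pair of items counts as positive if the two items share at least one cluster, which also works when clusters overlap. The function names below are hypothetical and are not taken from the article.

```python
from itertools import combinations


def co_clustered_pairs(clusters):
    """Return every unordered pair of items that shares at least one cluster."""
    pairs = set()
    for cluster in clusters:
        pairs.update(frozenset(p) for p in combinations(sorted(cluster), 2))
    return pairs


def pair_sensitivity_specificity(items, predicted, reference):
    """Pair-counting sensitivity and specificity (assumed definition).

    Overlapping clusters are allowed: an item may appear in several sets.
    """
    all_pairs = {frozenset(p) for p in combinations(sorted(items), 2)}
    pred = co_clustered_pairs(predicted)
    ref = co_clustered_pairs(reference)

    tp = len(pred & ref)               # together in both clusterings
    fn = len(ref - pred)               # together only in the reference
    fp = len(pred - ref)               # together only in the prediction
    tn = len(all_pairs - pred - ref)   # apart in both clusterings

    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity
```

For example, with items 1 to 5, a reference of `[{1, 2, 3}, {3, 4, 5}]` and a prediction of `[{1, 2, 3}, {4, 5}]`, the prediction misses the two reference pairs that involve the shared item 3, so sensitivity drops while specificity stays at 1.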

Conclusion: Using our new cluster selection method together with the method by Palla et al. provides a new and interesting clustering mechanism that allows overlapping clusters to be computed, which is especially valuable for biological and chemical data sets.
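To illustrate how the method by Palla et al. can yield overlapping clusters, the following is a minimal brute-force sketch of k-clique percolation: all k-cliques of a graph are enumerated, cliques that share k-1 nodes are merged, and each merged group forms one community. This is a sketch for small graphs only, not the implementation used in the article; the function name and the edge-list input format are assumptions.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return overlapping communities as a list of frozensets of nodes."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Brute-force enumeration of all k-cliques (exponential; small graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: two cliques are adjacent if they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each connected group of cliques contributes the union of its nodes.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return [frozenset(g) for g in groups.values()]
```

For two triangles that share only a single node, `k = 3` gives two communities that both contain the shared node, which is the multiple membership the abstract refers to. Different values of `k` give different levels of the hierarchy, which is why a level selection step is needed.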


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2774311
DOI: http://dx.doi.org/10.1186/1748-7188-4-12


