Hierarchical clustering of bipartite data sets based on the statistical significance of coincidences.

Phys Rev E

Grupo Interdisciplinar de Sistemas Complejos (GISC), Departamento de Matemáticas de la Universidad Carlos III de Madrid, Leganés, Spain.

Published: October 2020

When some 'entities' are related by the 'features' they share, they are amenable to a bipartite network representation. Plant-pollinator ecological communities, co-authorship of scientific papers, customers and purchases, or answers in a poll are but a few examples. Analyzing the clustering of such entities in the network is a useful tool with applications in many fields, such as internet technology, recommender systems, or detection of diseases. The algorithms most widely applied to find clusters in bipartite networks are variants of modularity optimization. Here, we provide a hierarchical clustering algorithm based on a dissimilarity between entities that quantifies the probability that the features shared by two entities are due to mere chance. The computational cost of the algorithm is O(n^{2}) when applied to a set of n entities, and its outcome is a dendrogram exhibiting the connections of those entities. Through the introduction of a 'susceptibility' measure we can provide an 'optimal' choice for the clustering as well as quantify its quality. The dendrogram reveals further useful structural information, though, such as the existence of subclusters within clusters or of nodes that do not fit in any cluster. We illustrate the algorithm by applying it first to a set of synthetic networks, and then to a selection of examples. We also illustrate how to transform our algorithm into a valid alternative for one-mode networks, and show that it performs at least as well as the standard, modularity-based algorithms, with a higher numerical performance. We provide an implementation of the algorithm in Python, freely accessible from GitHub.
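The core idea of the abstract, a dissimilarity equal to the probability that two entities share their features by chance, followed by hierarchical clustering, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypergeometric null model for the coincidences and plain average linkage, neither of which is stated in the abstract. The function names (`coincidence_pvalue`, `dissimilarity_matrix`, `average_linkage`) and the toy data are hypothetical. The naive merging loop below is also slower than the O(n^{2}) cost quoted above, and the 'susceptibility' measure is not reproduced.

```python
from math import comb


def coincidence_pvalue(n_features, k_a, k_b, shared):
    """Probability of sharing at least `shared` features by chance.

    Assumed null model (not taken from the paper): entity b draws its k_b
    features uniformly at random from n_features, of which k_a belong to
    entity a, so the overlap is hypergeometric.
    """
    total = comb(n_features, k_b)
    upper = min(k_a, k_b)
    tail = sum(
        comb(k_a, x) * comb(n_features - k_a, k_b - x)
        for x in range(shared, upper + 1)
    )
    return tail / total


def dissimilarity_matrix(entities, n_features):
    """Map each unordered pair of entity names to its chance probability."""
    names = list(entities)
    d = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = len(entities[a] & entities[b])
            d[frozenset((a, b))] = coincidence_pvalue(
                n_features, len(entities[a]), len(entities[b]), shared
            )
    return d


def average_linkage(names, d):
    """Naive agglomerative clustering; returns the list of merges in order."""
    clusters = [(name,) for name in names]
    merges = []

    def dist(c1, c2):
        # Average pairwise dissimilarity between the two clusters.
        pairs = [d[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the closest pair of clusters (first one found on ties).
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        merges.append((clusters[i], clusters[j], dist(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Toy bipartite data: four entities, 20 possible features (made up).
entities = {
    "A": {0, 1, 2, 3, 4},
    "B": {0, 1, 2, 3, 5},
    "C": {10, 11, 12, 13, 14},
    "D": {10, 11, 12, 13, 15},
}
d = dissimilarity_matrix(entities, n_features=20)
merges = average_linkage(list(entities), d)
for left, right, height in merges:
    print(left, right, round(height, 4))
```

In this toy example, A and B (and likewise C and D) share four of their five features, which is very unlikely by chance under the assumed null model, so each pair merges at a small height; the two resulting groups share nothing and only join at dissimilarity 1.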

Source
http://dx.doi.org/10.1103/PhysRevE.102.042304

Similar Publications

This study has developed a pressure sensor array based on four functionalized DNA-nanoenzymes with catalase-like activity for multiple detections of foodborne pathogens through a portable pressure manometer. Benefiting from functionalization of 4-mercaptophenylboronic acid and β-mercaptoethylamine, the diversity of nonspecific interactions between four DNA-nanoenzymes and each of the nine bacteria leads to differences in pressure response patterns by catalyzing H₂O₂ to generate exclusive "fingerprints". As effective statistical tools for processing multivariate data, principal component analysis and hierarchical clustering analysis are employed to identify nine foodborne pathogens by analyzing pressure response patterns.

African mole-rats (Bathyergidae, Rodentia) are subterranean rodents that live in extensive dark underground tunnel systems and rarely emerge aboveground. They can discriminate between light and dark but show no overt visually driven behaviours except for light-avoidance responses. Their eyes and central visual system are strongly reduced but not degenerated.

Influence of Axial Rotation Between the Femoral Neck and Ankle Joint on Kinematics in Normal Knees: A Cross-Sectional Study.

J Am Acad Orthop Surg Glob Res Rev

January 2025

From the Department of Orthopedic Surgery, Faculty of Medicine, The University of Tokyo, Bunkyo, Tokyo (Dr. Kono, Dr. Taketomi, Dr. Kage, Dr. Inui, and Dr. Tanaka); the Department of Information Systems, Faculty of Engineering, Saitama Institute of Technology, Fukaya, Saitama (Dr. Yamazaki); the Department of Orthopedic Biomaterial Science, Osaka University Graduate School of Medicine, Suita, Osaka (Dr. Tamaki, and Dr. Tomita); the Department of Orthopedic Surgery, Saitama Medical University, Saitama Medical Center, Kawagoe, Saitama (Dr. Inui); and the Department of Health Science, Graduate School of Health Science, Morinomiya University of Medical Sciences, Suminoe, Osaka, Japan (Dr. Tomita).

Background: The effect of axial rotation between the femoral neck and ankle joint (total rotation [TR]) on normal knees is unknown. Therefore, this study aimed to investigate the TR effect on normal knee kinematics.

Methods: Volunteers were divided into groups large (L), intermediate (I), and small (S), using hierarchical cluster analysis based on TR in the standing position.

Owing to its topographic variations, Ethiopia is a biodiversity-rich country. However, the long-term degradation of resources has resulted in isolated forest patches largely around sacred places. Thus, this work was aimed to evaluate the plant community formation and structural dynamics of the Abraham Sacred Forest patch.

Objectives: The purpose of the present study was to investigate the differential impact of disease activity and severity on functional status and patient satisfaction in rheumatoid arthritis (RA) using cluster analysis on data from the FRANK registry.

Methods: Data from 3,619 RA patients in the FRANK registry were analysed. Patients were grouped using hierarchical and k-means cluster analyses based on age, physician's global assessment (PhGA), patient's pain assessment (PtPA), and Steinbrocker stage.
