Scaling analysis of affinity propagation.

Phys Rev E Stat Nonlin Soft Matter Phys

INRIA-Saclay, F-91405 Orsay, France.

Published: June 2010

We analyze and exploit some scaling properties of the affinity propagation (AP) clustering algorithm proposed by Frey and Dueck [Science 315, 972 (2007)]. Following a divide-and-conquer strategy, we set up an exact renormalization-based approach to address the question of clustering consistency, in particular, how many clusters are present in a given data set. We first observe that the divide-and-conquer strategy, applied hierarchically to a large data set, reduces the complexity from $O(N^{2})$ to $O(N^{(h+2)/(h+1)})$, for a data set of size $N$ and a depth $h$ of the hierarchical strategy. For a data set embedded in a $d$-dimensional space, we show that this is obtained without notably damaging the precision, except in dimension $d=2$. In fact, for $d$ larger than 2 the relative loss in precision scales as $N^{(2-d)/((h+1)d)}$. Finally, under some conditions we observe that there is a value $s^*$ of the penalty coefficient, a free parameter used to fix the number of clusters, which separates a fragmentation phase (for $s<s^*$) from a coalescent one (for $s>s^*$) of the underlying hidden cluster structure. At this precise point a self-similarity property holds, which can be exploited by the hierarchical strategy to actually locate its position, as a result of an exact decimation procedure. From this observation, a strategy based on AP can be defined to find out how many clusters are present in a given data set.
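As a quick sanity check on the two scaling laws quoted above, the following minimal Python sketch evaluates the exponents for given values of h and d. It assumes the exponents read as (h+2)/(h+1) for the complexity and (2-d)/((h+1)d) for the relative loss in precision; the function names `complexity_exponent` and `loss_exponent` are hypothetical helpers introduced only for illustration and do not come from the paper.

```python
from fractions import Fraction


def complexity_exponent(h: int) -> Fraction:
    """Exponent a such that the cost scales as N**a at hierarchy depth h.

    Assumed reading of the abstract: a = (h + 2) / (h + 1).
    """
    if h < 0:
        raise ValueError("depth h must be non-negative")
    return Fraction(h + 2, h + 1)


def loss_exponent(d: int, h: int) -> Fraction:
    """Exponent b such that the relative precision loss scales as N**b.

    Assumed reading of the abstract: b = (2 - d) / ((h + 1) * d).
    """
    if d < 1 or h < 0:
        raise ValueError("need d >= 1 and h >= 0")
    return Fraction(2 - d, (h + 1) * d)


if __name__ == "__main__":
    # Tabulate both exponents for a few depths and dimensions.
    for h in range(4):
        for d in (2, 3, 10):
            print(f"h={h} d={d} "
                  f"complexity exponent={complexity_exponent(h)} "
                  f"loss exponent={loss_exponent(d, h)}")
```

Under this reading, h = 0 gives back the quadratic cost of plain AP, the complexity exponent approaches 1 as h grows, and the loss exponent is exactly zero at d = 2 and negative for d > 2, which matches the statement that precision is preserved except in dimension 2.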

Source
http://dx.doi.org/10.1103/PhysRevE.81.066102
