Clumppling: cluster matching and permutation program with integer linear programming.

Bioinformatics

Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, United States.

Published: January 2024

Motivation: In the mixed-membership unsupervised clustering analyses commonly used in population genetics, multiple replicate data analyses can differ in their clustering solutions. Combinatorial algorithms assist in aligning clustering outputs from multiple replicates so that clustering solutions can be interpreted and combined across replicates. Although several algorithms have been introduced, challenges exist in achieving optimal alignments and performing alignments in reasonable computation time.

Results: We present Clumppling, a method for aligning replicate solutions in mixed-membership unsupervised clustering. The method uses integer linear programming for finding optimal alignments, embedding the cluster alignment problem in standard combinatorial optimization frameworks. In example analyses, we find that it achieves solutions with preferred values of a desired objective function relative to those achieved by Pong and that it proceeds with less computation time than Clumpak. It is also the first method to permit alignments across replicates with multiple arbitrary values of the number of clusters K.
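For replicates that share the same K, the alignment step can be viewed as an assignment problem: choose a one-to-one matching between the clusters of two replicates that minimizes a total mismatch cost. The sketch below illustrates that objective on a toy example. It is not Clumppling's implementation. The squared-difference cost, the function names, and the toy matrices are assumptions made for illustration, and the matching is found by exhaustive enumeration, which is only practical for small K. An integer linear program with binary assignment variables would optimize the same kind of objective without enumerating every permutation.

```python
from itertools import permutations


def column_cost(q_a, q_b, k_a, k_b):
    """Squared difference between cluster k_a of q_a and cluster k_b of q_b.

    Assumed cost for illustration; the abstract does not state the
    objective function that Clumppling optimizes.
    """
    return sum((row_a[k_a] - row_b[k_b]) ** 2 for row_a, row_b in zip(q_a, q_b))


def align_replicate(q_ref, q_rep):
    """Find the cluster matching with the lowest total cost.

    Returns (perm, cost), where perm[i] is the cluster of q_rep matched
    to cluster i of q_ref. Enumerates all K! permutations.
    """
    k = len(q_ref[0])
    cost = [[column_cost(q_ref, q_rep, i, j) for j in range(k)] for i in range(k)]
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        total = sum(cost[i][perm[i]] for i in range(k))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


def apply_permutation(q, perm):
    """Reorder the cluster columns of q so they line up with the reference."""
    return [[row[p] for p in perm] for row in q]


# Toy membership matrices: rows are individuals, columns are clusters.
# q_rep is q_ref with its cluster labels shuffled.
q_ref = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
q_rep = [[0.1, 0.0, 0.9], [0.8, 0.1, 0.1], [0.2, 0.8, 0.0]]

perm, cost = align_replicate(q_ref, q_rep)
aligned = apply_permutation(q_rep, perm)
```

Here `aligned` has the same column order as `q_ref`, so the two replicates can be compared or averaged cluster by cluster. Aligning replicates with different values of K, which the abstract highlights as a feature of Clumppling, needs a more general formulation than this one-to-one matching.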

Availability and Implementation: Clumppling is available at https://github.com/PopGenClustering/Clumppling.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10766593
DOI: http://dx.doi.org/10.1093/bioinformatics/btad751


