In this paper we propose an efficient reformulation of the Markov clustering algorithm, suitable for fast and accurate grouping of protein sequences based on pairwise similarity information. The proposed modification optimally reorders the rows and columns of the similarity matrix after every iteration, transforming it into a matrix with several compact blocks along the diagonal and zero similarities outside the blocks. These blocks are treated separately in later iterations, which reduces the computational burden of the algorithm. The proposed algorithm was tested on protein sequence databases such as SCOP95. In terms of efficiency, the proposed solution achieves a speed-up factor in the range of 15-50 compared to conventional Markov clustering, depending on input data size and parameter settings. This improvement in computation time is achieved without any loss of partition accuracy. Convergence is usually reached within 40-50 iterations. Combining the proposed method with sparse matrix representation and parallel execution is expected to lead to a significantly more efficient solution in the future.
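The block-wise idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Markov clustering steps (expansion by squaring, element-wise inflation, pruning of small entries) and stands in for the paper's reordering step by finding connected components of the nonzero pattern, which yields the same diagonal blocks. The function names, the inflation value of 2.0, the pruning threshold `eps`, and the tolerance `tol` are all hypothetical choices for illustration, and plain Python lists are used only to keep the example dependency-free.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mcl_step(m, inflation, eps):
    """One iteration: expansion, inflation, pruning, renormalization."""
    n = len(m)
    expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    inflated = normalize_columns([[v ** inflation for v in row]
                                  for row in expanded])
    pruned = [[v if v >= eps else 0.0 for v in row] for row in inflated]
    return normalize_columns(pruned)


def components(m):
    """Connected components of the symmetrized nonzero pattern of m."""
    n = len(m)
    seen, comps = [False] * n, []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in range(n):
                if not seen[v] and (m[u][v] > 0 or m[v][u] > 0):
                    seen[v] = True
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


def block_mcl(sim, inflation=2.0, eps=1e-4, max_iter=100, tol=1e-9):
    """Cluster items from a similarity matrix, iterating each block alone."""
    n = len(sim)
    active = [(list(range(n)), normalize_columns(sim))]  # (item ids, matrix)
    finished = []
    for _ in range(max_iter):
        if not active:
            break
        next_active = []
        for ids, m in active:
            new = mcl_step(m, inflation, eps)
            change = max(abs(new[i][j] - m[i][j])
                         for i in range(len(m)) for j in range(len(m)))
            # Split the block wherever pruning has disconnected it.
            for comp in components(new):
                sub_ids = [ids[i] for i in comp]
                sub = [[new[i][j] for j in comp] for i in comp]
                if change < tol or len(comp) == 1:
                    finished.append(sub_ids)
                else:
                    next_active.append((sub_ids, sub))
        active = next_active
    finished.extend(ids for ids, _ in active)  # blocks still iterating
    return sorted(finished)
```

Because expansion and inflation never create nonzero entries between disconnected blocks, iterating each block on its own gives the same result as iterating the full matrix, while each matrix product costs time cubic in the block size rather than in the full matrix size. Calling `block_mcl` on a small similarity matrix with two tightly linked groups joined by one weak similarity should return the two groups as separate lists of item indices.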

Source
http://dx.doi.org/10.1109/EMBC.2013.6609581
