Subgraph matching algorithms are designed to find all instances of predefined subgraphs in a large graph or network and play an important role in the discovery and analysis of so-called network motifs, subgraph patterns which occur more often than expected by chance. We present the index-based subgraph matching algorithm (ISMA), a novel tree-based algorithm. ISMA realizes a speedup compared to existing algorithms by carefully selecting the order in which the nodes of a query subgraph are investigated. In order to achieve this, we developed a number of data structures and maximally exploited symmetry characteristics of the subgraph. We compared ISMA to a naive recursive tree-based algorithm and to a number of well-known subgraph matching algorithms. Our algorithm outperforms the other algorithms, especially on large networks and with large query subgraphs. An implementation of ISMA in Java is freely available at http://sourceforge.net/projects/isma/.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3631255PMC
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0061183PLOS

Publication Analysis

Top Keywords

subgraph matching
16
algorithm isma
12
index-based subgraph
8
matching algorithm
8
large networks
8
matching algorithms
8
tree-based algorithm
8
subgraph
7
algorithm
5
isma
5

Similar Publications

Genome-wide association studies are enriched for interacting genes.

BioData Min

January 2025

The Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, 90069, USA.

Background: With recent advances in single cell technology, high-throughput methods provide unique insight into disease mechanisms and more importantly, cell type origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk.

View Article and Find Full Text PDF

Background: With recent advances in single cell technology, high-throughput methods provide unique insight into disease mechanisms and more importantly, cell type origin. Here, we used multi-omics data to understand how genetic variants from genome-wide association studies influence development of disease. We show in principle how to use genetic algorithms with normal, matching pairs of single-nucleus RNA- and ATAC-seq, genome annotations, and protein-protein interaction data to describe the genes and cell types collectively and their contribution to increased risk.

View Article and Find Full Text PDF

Rapid Mining of Fast Ion Conductors via Subgraph Isomorphism Matching.

J Am Chem Soc

July 2024

School of Advanced Materials, Peking University, Shenzhen Graduate School, Shenzhen 518055, P. R. China.

The rapidly evolving field of inorganic solid-state electrolytes (ISSEs) has been driven in recent years by advances in data-mining techniques, which facilitates the high-throughput computational screening for candidate materials in the databases. The key to the mining process is the selection of critical features that underline the similarity of a material to an existing ISSE. Unfortunately, this selection is generally subjective and frequently under debate.

View Article and Find Full Text PDF

Semantic units: organizing knowledge graphs into semantically meaningful units of representation.

J Biomed Semantics

May 2024

Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia.

Background: In today's landscape of data management, the importance of knowledge graphs and ontologies is escalating as critical mechanisms aligned with the FAIR Guiding Principles-ensuring data and metadata are Findable, Accessible, Interoperable, and Reusable. We discuss three challenges that may hinder the effective exploitation of the full potential of FAIR knowledge graphs.

Results: We introduce "semantic units" as a conceptual solution, although currently exemplified only in a limited prototype.

View Article and Find Full Text PDF

As a knowledge representation method, knowledge graph is widely used in intelligent question answering systems and recommendation systems. At present, the research on knowledge graph mainly focuses on information query and retrieval based on knowledge graph. In some domain knowledge graphs, specific subgraph structures (patterns) have specific physical meanings.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!