Frequent subgraph mining (FSM) is an essential and challenging graph mining task used in several applications of modern data science. Some FSM algorithms aim to find all frequent subgraphs, whereas others discover frequent subgraphs approximately. Modern applications, however, employ evolving graphs whose increments are small graphs or streams of nodes and edges. In such cases, the FSM task becomes more challenging due to the growing data size and the complexity of the base algorithms. FSM algorithms designed for dynamic graph data have recently appeared. However, there is no comparative review of dynamic subgraph mining algorithms that focuses on the discovery of frequent subgraphs over evolving graph data. This article focuses on the characteristics of dynamic frequent subgraph mining algorithms over evolving graphs. We first introduce and compare dynamic frequent subgraph mining algorithms, highlighting attributes such as increment type, graph type, graph representation, internal data structure, algorithmic approach, programming approach, base algorithm, and output type. Second, we introduce and compare the approximate frequent subgraph mining algorithms for dynamic graphs using additional attributes such as sampling strategy, data in the sample, statistical guarantees on the sample, and main objective. Finally, we highlight research opportunities in this specific domain from our perspective. Overall, we aim to introduce the research area of frequent subgraph mining over evolving graphs, in the hope that this review can serve as a reference and inspiration for researchers in the field.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622988 | PMC |
http://dx.doi.org/10.7717/peerj-cs.2361 | DOI Listing |
PeerJ Comput Sci
October 2024
Computer Engineering, Izmir Institute of Technology, Izmir, Turkey.
PeerJ Comput Sci
November 2024
Computer Engineering, Izmir Institute of Technology, Izmir, Turkey.
Clique counting is a crucial task in graph mining, as clique counts provide insights across various domains, including social and biological network analysis, community detection, recommendation systems, and fraud detection. Counting cliques is algorithmically challenging due to combinatorial explosion, especially for large datasets and larger clique sizes. There are comprehensive surveys and reviews on algorithms for counting subgraphs and triangles (three-cliques), but there is a notable lack of reviews addressing k-clique counting algorithms for k > 3.
J Chem Inf Model
December 2024
College of Metallurgy and Energy Engineering, Kunming University of Science and Technology, Kunming 650031, China.
Accurately identifying sites of metabolism (SoM) mediated by cytochrome P450 (CYP) enzymes, which are responsible for drug metabolism in the body, is critical in the early stage of drug discovery and development. Current computational methods for CYP-mediated SoM prediction face several challenges, including limitations to traditional machine learning models at the atomic level, heavy reliance on complex feature engineering, and the lack of interpretability relevant to medicinal chemistry. Here, we propose GraphCySoM, a novel molecule-level modeling approach based on graph neural networks, utilizing lightweight features and interpretable annotations on substructures, to effectively and interpretably predict CYP-mediated SoM.
Sensors (Basel)
November 2024
School of Software Technology, Dalian University of Technology, Dalian 116024, China.
Knowledge graph link prediction, which aims to infer whether a relation exists between entities, is crucial for constructing triples in knowledge graphs. Recently, graph neural networks and contrastive learning have demonstrated superior performance compared with traditional translation-based models; they successfully extract common features through explicit links between entities. However, the implicit associations between entities without a linking relationship are ignored, which prevents the model from capturing distant but semantically rich entities.
BMC Bioinformatics
November 2024
Computer Science and Engineering, Qatar University, Doha, Qatar.
Networks have emerged as a natural data structure to represent relations among entities. Proteins interact to carry out cellular functions, and protein-protein interaction network analysis has been employed to understand the cellular machinery. Advances in genomics technologies have enabled the collection of large datasets that annotate proteins in interaction networks.