Exploring sparsity in graph transformers.

Neural Networks

School of Computer Science, Wuhan University, Wuhan, China.

Published: June 2024

Graph Transformers (GTs) have achieved impressive results on various graph-related tasks. However, the huge computational cost of GTs hinders their deployment and application, especially in resource-constrained environments. Therefore, in this paper, we explore the feasibility of sparsifying GTs, a significant yet under-explored topic. We first discuss the redundancy of GTs based on the characteristics of existing GT models, and then propose a comprehensive Graph Transformer SParsification (GTSP) framework that helps to reduce the computational complexity of GTs from four dimensions: the input graph data, attention heads, model layers, and model weights. Specifically, GTSP designs differentiable masks for each individual compressible component, enabling effective end-to-end pruning. We examine our GTSP through extensive experiments on prominent GTs, including GraphTrans, Graphormer, and GraphGPS. The experimental results demonstrate that GTSP effectively reduces computational costs, with only marginal decreases in accuracy or, in some instances, even improvements. For example, GTSP results in a 30% reduction in Floating Point Operations while contributing to a 1.8% increase in Area Under the Curve accuracy on the OGBG-HIV dataset. Furthermore, we provide several insights on the characteristics of attention heads and the behavior of attention mechanisms, all of which have immense potential to inspire future research endeavors in this domain. Our code is available at https://github.com/LiuChuang0059/GTSP.
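The abstract describes GTSP's core mechanism: a differentiable mask attached to each compressible component (e.g. an attention head), trained end-to-end and thresholded to prune. The paper's actual implementation is in the linked repository; the following is only a minimal NumPy sketch of the differentiable-gate idea applied to attention heads, with all names and the sigmoid-gate parameterization being illustrative assumptions rather than the authors' code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def masked_multihead(head_outputs, mask_logits, threshold=0.5):
    """Gate each attention head with a differentiable mask (illustrative sketch).

    head_outputs: (H, N, D) per-head outputs for N nodes
    mask_logits:  (H,) learnable scores; sigmoid maps them to soft gates in (0, 1)

    During training the soft gates keep the objective differentiable and an
    L1-style penalty on the gates encourages sparsity; at inference, heads
    whose gate falls below `threshold` are dropped outright.
    """
    gates = sigmoid(mask_logits)                   # soft, differentiable gates
    keep = gates >= threshold                      # hard prune at inference
    pruned = head_outputs * (gates * keep)[:, None, None]
    sparsity_penalty = gates.sum()                 # regularizer pushing gates to 0
    return pruned.sum(axis=0), keep, sparsity_penalty

# toy example: 4 heads, 3 nodes, 2-dim features
rng = np.random.default_rng(0)
heads = rng.normal(size=(4, 3, 2))
logits = np.array([2.0, -3.0, 1.5, -2.5])          # two heads near-on, two near-off
out, kept, penalty = masked_multihead(heads, logits)
print(kept)  # only the two heads with positive logits survive
```

The same gating pattern extends to the other three dimensions the paper names (graph data, layers, weights) by attaching an analogous learnable mask to each.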

DOI: http://dx.doi.org/10.1016/j.neunet.2024.106265

Similar Publications

Simulating large molecular systems over long timescales requires force fields that are both accurate and efficient. In recent years, E(3) equivariant neural networks have lifted the tension between computational efficiency and accuracy of force fields, but they are still several orders of magnitude more expensive than established molecular mechanics (MM) force fields. Here, we propose Grappa, a machine learning framework to predict MM parameters from the molecular graph, employing a graph attentional neural network and a transformer with symmetry-preserving positional encoding.

SecEdge: A novel deep learning framework for real-time cybersecurity in mobile IoT environments.

Heliyon

January 2025

Department of Natural and Engineering Sciences, College of Applied Studies and Community Services, King Saud University, Riyadh, 11633, Saudi Arabia.

The rapid growth of Internet of Things (IoT) devices presents significant cybersecurity challenges due to their diverse and resource-constrained nature. Existing security solutions often fall short in addressing the dynamic and distributed environments of IoT systems. This study aims to propose a novel deep learning framework, SecEdge, designed to enhance real-time cybersecurity in mobile IoT environments.

Motivation: Drug-target interaction (DTI) prediction is crucial for drug discovery, significantly reducing costs and time in experimental searches across vast drug compound spaces. While deep learning has advanced DTI prediction accuracy, challenges remain: (i) existing methods often lack generalizability, with performance dropping significantly on unseen proteins and cross-domain settings; (ii) current molecular relational learning often overlooks subpocket-level interactions, which are vital for a detailed understanding of binding sites.

Results: We introduce SP-DTI, a subpocket-informed transformer model designed to address these challenges through: (i) detailed subpocket analysis using the Cavity Identification and Analysis Routine (CAVIAR) for interaction modeling at both global and local levels, and (ii) integration of pre-trained language models into graph neural networks to encode drugs and proteins, enhancing generalizability to unlabeled data.

Graph representation learning has been leveraged to identify cancer genes from biological networks. However, its applicability is limited by insufficient interpretability and generalizability under integrative network analysis. Here we report the development of an interpretable and generalizable transformer-based model that accurately predicts cancer genes by leveraging graph representation learning and the integration of multi-omics data with the topologies of homogeneous and heterogeneous networks of biological interactions.

Human mobility between different regions is a major factor in large-scale outbreaks of infectious diseases. Deep learning models incorporating infectious disease transmission dynamics for predicting the spread of multi-regional outbreaks due to human mobility have become a hot research topic. In this study, we incorporate the Graph Transformer Neural Network and graph learning mechanisms into a metapopulation SIR model to build a hybrid framework, Metapopulation Graph Transformer Neural Network (M-Graphormer), for high-dimensional parameter estimation and multi-regional epidemic prediction.
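The M-Graphormer framework above couples a graph neural network with classic metapopulation SIR dynamics. As background for readers unfamiliar with the dynamics side, here is a minimal single-region SIR step (standard textbook model, not the M-Graphormer code); the metapopulation variant additionally moves individuals between regions via a mobility matrix before each epidemic step.

```python
import numpy as np

def sir_step(S, I, R, beta, gamma, dt=1.0):
    """One explicit-Euler step of the classic SIR compartment model.

    S, I, R are population fractions (S + I + R = 1); beta is the
    transmission rate and gamma the recovery rate.
    """
    new_inf = beta * S * I * dt   # S -> I transitions
    new_rec = gamma * I * dt      # I -> R transitions
    return S - new_inf, I + new_inf - new_rec, R + new_rec

# simulate 50 steps from a 1% seeded infection
S, I, R = 0.99, 0.01, 0.0
for _ in range(50):
    S, I, R = sir_step(S, I, R, beta=0.3, gamma=0.1)
print(round(S + I + R, 6))  # compartments are conserved: always sum to 1
```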
