Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules.

J Cheminform

Computational Sciences and Engineering Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN, 37831, USA.

Published: October 2022

Graph Convolutional Neural Networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from the graph representation of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to reducing the computational cost of GCNN training effectively. However, efficient utilization of high-performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, which leverages distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework, for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present our experimental results with HydraGNN, showing (i) a reduction of data loading time by up to 4.2 times compared with a conventional method and (ii) linear scaling performance for training on up to 1024 GPUs on both Summit and Perlmutter.
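
The idea behind distributed data parallelism mentioned in the abstract can be illustrated with a minimal, single-process sketch. It is not HydraGNN code and does not use the PyTorch or ADIOS APIs; the function names, the toy linear model, and the data are all hypothetical, chosen only to show that averaging per-rank gradients over equal-sized shards reproduces the gradient of the full batch.

```python
# Single-process simulation of one data-parallel SGD step.
# Assumption: this is an illustration only; no PyTorch, MPI, or ADIOS is involved.

def shard(samples, world_size, rank):
    """Give each simulated rank every world_size-th sample (strided partition)."""
    return samples[rank::world_size]

def local_gradient(w, batch):
    """Gradient of the mean squared error of the model y = w * x on one local batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def all_reduce_mean(values):
    """Stand-in for an all-reduce: every rank would receive this average."""
    return sum(values) / len(values)

def data_parallel_step(w, samples, world_size, lr):
    """Each rank computes a gradient on its shard; the averaged gradient updates w."""
    grads = [local_gradient(w, shard(samples, world_size, r)) for r in range(world_size)]
    return w - lr * all_reduce_mean(grads)

# Hypothetical toy data: y = 3x, 8 samples split evenly over 4 simulated ranks.
samples = [(float(x), 3.0 * x) for x in range(1, 9)]
w_parallel = data_parallel_step(0.0, samples, world_size=4, lr=0.01)
w_serial = data_parallel_step(0.0, samples, world_size=1, lr=0.01)
```

Because the shards have equal size, `w_parallel` and `w_serial` agree up to floating-point rounding. In a real multi-GPU run the averaging step is performed by a collective communication operation rather than a Python list reduction.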

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575242
DOI: http://dx.doi.org/10.1186/s13321-022-00652-1

Similar Publications

A two-level resolution neural network with enhanced interpretability for freeway traffic forecasting.

Sci Rep

December 2024

Department of Civil Engineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Deep learning models are widely used for traffic forecasting on freeways due to their ability to learn complex temporal and spatial relationships. In particular, graph neural networks, which integrate graph theory into deep learning, have become popular for modeling traffic sensor networks. However, traditional graph convolutional networks (GCNs) face limitations in capturing long-range spatial correlations, which can hinder accurate long-term predictions.

Urban mobility prediction is crucial for optimizing resource allocation, managing transportation systems, and planning urban development. We propose a novel framework, GeoTemporal LSTM (GT-LSTM), designed to address the intricate spatiotemporal dynamics of urban environments. GT-LSTM integrates temporal dependencies with geographic information through a multi-modal approach that combines attention mechanisms and Recurrent Neural Networks (RNNs).

In the contemporary field of life sciences, researchers have gradually recognized the critical role of microbes in maintaining human health. However, traditional biological experimental methods for validating the association between microbes and diseases are both time-consuming and costly. Therefore, developing effective computational methods to predict potential associations between microbes and diseases is an important and urgent task.

To improve students' understanding of physical education teaching concepts and help teachers analyze students' cognitive patterns, the study proposes an association learning-based method for understanding physical education teaching concepts using deep learning algorithms, which extracts image features related to teaching concepts using convolutional neural networks. Moreover, a neurocognitive diagnostic model based on hypergraph convolution is constructed to mine the data of students' long-term learning sequences and identify students' cognitive outcomes. The findings revealed that the highest accuracy of the association graph convolutional neural network was 0.

Enhanced technologies of the future are gradually improving the digital landscape. Internet of Things (IoT) technology is an advanced technique that is quickly increasing owing to the development of a network of organized online devices. In today's digital era, the IoT is considered one of the most robust technologies.
