An Efficient RI-MP2 Algorithm for Distributed Many-GPU Architectures.

J Chem Theory Comput

School of Computing and Information Systems, University of Melbourne, Melbourne 3010, Australia.

Published: November 2024

Second-order Møller-Plesset perturbation theory (MP2) using the Resolution of the Identity approximation (RI-MP2) is a widely used method for computing molecular energies beyond the Hartree-Fock mean-field approximation. However, its high computational cost and lack of efficient algorithms for modern supercomputing architectures limit its applicability to large molecules. In this paper, we present the first distributed-memory many-GPU RI-MP2 algorithm explicitly designed to utilize hundreds of GPU accelerators for every step of the computation. Our novel algorithm achieves near-peak performance on GPU-based supercomputers through the development of two components: a distributed-memory algorithm for forming RI-MP2 intermediate tensors with zero internode communication, except for a single asynchronous broadcast, and a distributed-memory algorithm for the energy reduction step, capable of sustaining near-peak performance on clusters with several hundred GPUs. Comparative analysis shows our implementation outperforms state-of-the-art quantum chemistry software by over 3.5 times in speed while achieving an 8-fold reduction in computational power consumption. Benchmarking on the Perlmutter supercomputer, our algorithm achieves 11.8 PFLOP/s (83% of peak performance) while performing the RI-MP2 energy calculation on a 314-water cluster with 7850 primary and 30,144 auxiliary basis functions in 4 min on 180 nodes and 720 A100 GPUs. This performance represents a substantial improvement over traditional CPU-based methods, demonstrating significant time-to-solution and power consumption benefits of leveraging modern GPU-accelerated computing environments for quantum chemistry calculations.
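For readers unfamiliar with the quantity being computed, the following is a minimal serial sketch of the textbook closed-shell RI-MP2 correlation energy, in which each four-index integral is approximated as (ia|jb) ≈ Σ_P B[P][i][a]·B[P][j][b]. It is not the distributed many-GPU algorithm described in the abstract; the function name `rimp2_energy`, the nested-list layout of `B`, and the toy input values are assumptions made purely for illustration.

```python
import random


def rimp2_energy(B, eps_occ, eps_virt):
    """Closed-shell RI-MP2 correlation energy (illustrative, serial).

    B[P][i][a]  : three-index RI factors (hypothetical nested-list layout)
    eps_occ[i]  : occupied orbital energies
    eps_virt[a] : virtual orbital energies
    """
    n_aux = len(B)
    n_occ, n_virt = len(eps_occ), len(eps_virt)
    energy = 0.0
    for i in range(n_occ):
        for j in range(n_occ):
            # V[a][b] = (ia|jb) ~= sum_P B[P][i][a] * B[P][j][b];
            # this is one small matrix product per occupied pair (i, j).
            V = [[sum(B[P][i][a] * B[P][j][b] for P in range(n_aux))
                  for b in range(n_virt)]
                 for a in range(n_virt)]
            for a in range(n_virt):
                for b in range(n_virt):
                    denom = eps_occ[i] + eps_occ[j] - eps_virt[a] - eps_virt[b]
                    # (ib|ja) is the transpose element V[b][a].
                    energy += V[a][b] * (2.0 * V[a][b] - V[b][a]) / denom
    return energy


# Toy input: random factors and made-up orbital energies (not real chemistry).
random.seed(0)
n_aux, n_occ, n_virt = 6, 2, 3
B = [[[random.uniform(-0.1, 0.1) for _ in range(n_virt)]
      for _ in range(n_occ)]
     for _ in range(n_aux)]
eps_occ = [-0.6, -0.4]
eps_virt = [0.3, 0.5, 0.9]
print(rimp2_energy(B, eps_occ, eps_virt))
```

In this sketch the per-pair matrix product stands in for the dense linear algebra that dominates the cost, and the accumulation into `energy` corresponds loosely to what the abstract calls the energy reduction step.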

Source
http://dx.doi.org/10.1021/acs.jctc.4c00814

Similar Publications

We present a linear scaling atomic orbital based algorithm for the computation of the most expensive exchange-type RI-MP2-F12 term by employing numerical quadrature in combination with CABS-RI to avoid six-center-three-electron integrals. Furthermore, a robust distance-dependent integral screening scheme, based on integral partition bounds [Thompson, T. H.

Analytical Gradient Using Cluster-in-Molecule RI-MP2 Method for the Geometry Optimizations of Large Systems.

J Chem Theory Comput

May 2024

Key Laboratory of Mesoscopic Chemistry of Ministry of Education, New Cornerstone Science Laboratory, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, Jiangsu 210023, P. R. China.

We present an efficient analytical energy gradient algorithm for the cluster-in-molecule resolution-of-identity second-order Møller-Plesset perturbation (CIM-RI-MP2) method based on the Lagrange multiplier method. Our algorithm independently constructs the Lagrangian formalism within each cluster, avoiding the solution of the coupled-perturbed Hartree-Fock (CPHF) equation for the whole system. Due to this feature, the computational cost of the CIM-RI-MP2 gradients is much lower than that of other local MP2 algorithms.

This article presents a novel algorithm for the calculation of analytic energy gradients from second-order Møller-Plesset perturbation theory within the Resolution-of-the-Identity approximation (RI-MP2), which is designed to achieve high performance on clusters with multiple graphics processing units (GPUs). The algorithm uses GPUs for all major steps of the calculation, including integral generation, formation of all required intermediate tensors, solution of the Z-vector equation, and gradient accumulation. The implementation in the EXtreme Scale Electronic Structure System (EXESS) software package includes a tailored, highly efficient, multistream scheduling system to hide CPU-GPU data transfer latencies and allows nodes with 8 A100 GPUs to operate at over 80% of theoretical peak floating-point performance.

Electronic structure calculations have the potential to predict key matter transformations for applications of strategic technological importance, from drug discovery to material science and catalysis. However, a predictive physicochemical characterization of these processes often requires accurate quantum chemical modeling of complex molecular systems with hundreds to thousands of atoms. Due to the computationally demanding nature of electronic structure calculations and the complexity of modern high-performance computing hardware, quantum chemistry software has historically failed to operate at such large molecular scales with accuracy and speed that are useful in practice.
