We present a hybrid MPI-OpenMP implementation of linear-scaling density functional theory within the ONETEP code. We illustrate its performance on a range of high-performance computing (HPC) platforms comprising shared-memory nodes with fast interconnects. Our work has focused on applying OpenMP parallelism to the routines that dominate the computational load, attempting where possible to parallelize different loops from those already parallelized with MPI. These routines include 3D FFT box operations, sparse matrix algebra operations, calculation of integrals, and Ewald summation. While the underlying numerical methods are unchanged, these developments represent significant changes to the algorithms used within ONETEP to distribute the workload across CPU cores. The new hybrid code exhibits much-improved strong scaling relative to the MPI-only code and permits calculations with a much higher ratio of cores to atoms. These developments result in a significantly shorter time to solution than was possible using MPI alone and facilitate the application of the ONETEP code to systems larger than previously feasible. We illustrate this with benchmark calculations on an amyloid fibril trimer containing 41,907 atoms. We use the code to study the mechanism of delamination of cellulose nanofibrils undergoing sonication, a process controlled by a large number of interactions that collectively determine the structural properties of the fibrils. Many energy evaluations were needed for these simulations, and because these systems comprise up to 21,276 atoms, this would not have been feasible without the developments described here.
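The two-level idea in the abstract, where threads share a different loop from the one already divided among MPI processes, can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not ONETEP's implementation: plain Python threads stand in for OpenMP threads, a loop over blocks stands in for MPI ranks, and the names `block_partition`, `work`, `rank_contribution`, and `hybrid_total` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def block_partition(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous, near-equal blocks."""
    base, extra = divmod(n_items, n_parts)
    blocks, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def work(atom, grid_point):
    """Hypothetical stand-in for one unit of numerical work."""
    return (atom + 1) * (grid_point + 1)


def rank_contribution(atoms, n_grid, n_threads):
    """Outer level: one 'rank' owns a block of atoms.
    Inner level: its threads share the grid-point loop, a different loop."""
    def thread_sum(points):
        return sum(work(a, g) for a in atoms for g in points)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(thread_sum, block_partition(n_grid, n_threads)))


def hybrid_total(n_atoms, n_grid, n_ranks, n_threads):
    # In a real hybrid code each rank is a separate process and this sum
    # would be a reduction across ranks; here the ranks are simply looped over.
    return sum(rank_contribution(atoms, n_grid, n_threads)
               for atoms in block_partition(n_atoms, n_ranks))


def serial_total(n_atoms, n_grid):
    """Reference result computed with no decomposition at all."""
    return sum(work(a, g) for a in range(n_atoms) for g in range(n_grid))
```

For example, `hybrid_total(10, 16, 4, 3)` uses 4 ranks of 3 threads each, 12 workers for 10 atoms, and returns the same value as `serial_total(10, 16)`. A decomposition over atoms alone could give work to at most 10 workers here, which is one way to read the abstract's point about a higher ratio of cores to atoms.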
DOI: http://dx.doi.org/10.1021/ct500686r
J Chem Phys
August 2024
Center for Computational Quantum Physics, Flatiron Institute, 162 5th Avenue, New York, New York 10010, USA.
We provide a detailed exposition of our computational framework designed for the accurate calculation of real-frequency dynamical correlation functions of the single-impurity Anderson model in the regime of weak to intermediate coupling. Using quantum field theory within the Keldysh formalism to directly access the self-energy and dynamical susceptibilities in real frequencies, as detailed in our recent publication [Ge et al., Phys.
J Chem Theory Comput
March 2024
Department of Chemistry, Iowa State University and Ames National Laboratory, Ames, Iowa 50011, United States.
The effective fragment molecular orbital (EFMO) method has been developed to predict the total energy of a very large molecular system accurately (with respect to the underlying quantum mechanical method) and efficiently by taking advantage of the locality of strong chemical interactions and employing a two-level hierarchical parallelism. The accuracy of the EFMO method is partly attributed to the accurate and robust intermolecular interaction prediction between distant fragments, in particular, the many-body polarization and dispersion effects, which require the generation of static and dynamic polarizability tensors by solving the coupled perturbed Hartree-Fock (CPHF) and time-dependent HF (TDHF) equations, respectively. Solving the CPHF and TDHF equations is the main EFMO computational bottleneck due to the inefficient (serial) and I/O-intensive implementation of the CPHF and TDHF solvers.
J Chem Theory Comput
January 2024
PDC Center for High Performance Computing, KTH Royal Institute of Technology, Stockholm SE-100 44, Sweden.
We present the implementation of an efficient matrix-folded formalism for the evaluation of complex response functions and the calculation of transition properties at the level of the second-order algebraic-diagrammatic construction (ADC(2)) scheme. The underlying algorithms, in combination with the adopted hybrid MPI/OpenMP parallelization strategy, enabled calculations of the UV/vis spectra of a guanine oligomer series ranging up to 1032 contracted basis functions, thereby utilizing vast computational resources from up to 32,768 CPU cores. Further analysis of the convergence behavior of the involved iterative subspace algorithms revealed the superiority of a frequency-separated treatment of response equations even for a large spectral window, including 101 frequencies.
J Biomech Eng
July 2023
Dynaflow, Inc., 10621-J Iron Bridge Road, Jessup, MD 20724.
Microbubble enhanced high intensity focused ultrasound (HIFU) is of great interest to tissue ablation for solid tumor treatments such as in liver and brain cancers, in which contrast agents/microbubbles are injected into the targeted region to promote heating and reduce prefocal tissue damage. A compressible Euler-Lagrange coupled model has been developed to accurately characterize the acoustic and thermal fields during this process. This employs a compressible Navier-Stokes solver for the ultrasound acoustic field and a discrete singularities model for bubble dynamics.
J Chem Theory Comput
April 2022
Department of Chemistry and Ames Laboratory, Iowa State University, Ames, Iowa 50011, United States.
In recent years, parallelism via multithreading has become extremely important to the optimization of high-performance electronic structure theory codes. Such multithreading is generally achieved via OpenMP constructs, using a fork-join threading model to enable thread-level data parallelism within the code. An alternative approach to multithreading is , which displays multiple benefits relative to fork-join thread parallelism.