J Chem Theory Comput
December 2024
This Article presents two optimized multi-GPU algorithms for Fock matrix construction, building on the work of Ufimtsev and Martinez [ 2009, 5, 1004-1015] and Barca et al. [ 2021, 17, 7486-7503]. The novel algorithms, opt-UM and opt-Brc, introduce significant enhancements, including improved integral screening, exploitation of sparsity and symmetry, a linear scaling exchange matrix assembly algorithm, and extended capabilities for Hartree-Fock calculations up to -type angular momentum functions.
J Chem Theory Comput
September 2024
This article presents an optimized algorithm and implementation for calculating resolution-of-the-identity Hartree-Fock (RI-HF) energies and analytic gradients using multiple graphics processing units (GPUs). The algorithm is especially designed for high throughput ab initio molecular dynamics simulations of small and medium size molecules (10-100 atoms). Key innovations of this work include the exploitation of multi-GPU parallelism and a workload balancing scheme that efficiently distributes computational tasks among GPUs.
This article presents a novel algorithm for the calculation of analytic energy gradients from second-order Møller-Plesset perturbation theory within the Resolution-of-the-Identity approximation (RI-MP2), which is designed to achieve high performance on clusters with multiple graphical processing units (GPUs). The algorithm uses GPUs for all major steps of the calculation, including integral generation, formation of all required intermediate tensors, solution of the Z-vector equation and gradient accumulation. The implementation in the EXtreme Scale Electronic Structure System (EXESS) software package includes a tailored, highly efficient, multistream scheduling system to hide CPU-GPU data transfer latencies and allows nodes with 8 A100 GPUs to operate at over 80% of theoretical peak floating-point performance.