Efficient Mixed-Precision Matrix Factorization of the Inverse Overlap Matrix in Electronic Structure Calculations with AI-Hardware and GPUs.

J Chem Theory Comput

Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States.

Published: August 2024

In recent years, a new kind of accelerated hardware has gained popularity in the artificial intelligence (AI) community, enabling extremely high-performance tensor contractions in reduced precision for deep neural network calculations. In this article, we exploit Nvidia Tensor cores, a prototypical example of such AI-hardware, to develop a mixed-precision approach for computing a dense matrix factorization of the inverse overlap matrix, S⁻¹, in electronic structure theory. This factorization of S⁻¹, written as S⁻¹ = ZZᵀ, is used to transform the generalized matrix eigenvalue problem into a standard matrix eigenvalue problem. Here we present a mixed-precision iterative refinement algorithm in which Z is constructed recursively from matrix-matrix multiplications and can therefore be computed with high performance on Tensor cores. To assess the performance and accuracy of Tensor cores, comparisons are made to GPU-only implementations in single and double precision. Additionally, we propose a nonparametric stopping criterion that is robust in the face of lower-precision floating-point operations. The algorithm is particularly useful when a good initial guess for Z is available, for example, from previous time steps in quantum-mechanical molecular dynamics simulations or from a previous iteration of a geometry optimization.
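
To make the idea concrete, here is a minimal NumPy sketch (my own illustration, not the authors' implementation) of one low-order member of this family of refinement iterations: an approximate inverse factor Z of the overlap matrix S is improved using only matrix products, Z ← Z(I + δ/2) with δ = I − ZᵀSZ, and the loop stops as soon as the measured error stops decreasing, a simple stand-in for the paper's nonparametric stopping criterion. The reduced-precision Tensor-core products are emulated by rounding operands to float16; matrix sizes, the shift used to mimic a previous MD time step, and all tolerances are made up for illustration.

```python
import numpy as np

def low_precision_mm(a, b):
    # Emulate a reduced-precision (Tensor-core-like) matrix product by rounding
    # the operands to float16 before multiplying; promote the result back to float64.
    return (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float64)

def refine_inverse_factor(S, Z0, max_iter=20):
    """Refine Z0 toward Z with Z^T S Z = I, i.e. S^{-1} = Z Z^T, using only matrix products.

    Second-order update Z <- Z (I + delta/2), with delta = I - Z^T S Z.  The loop stops
    when the error estimate ||delta||_F no longer decreases, signalling that the
    low-precision noise floor has been reached; this is an illustrative stand-in for
    the paper's nonparametric stopping criterion.
    """
    n = S.shape[0]
    I = np.eye(n)
    Z = Z0.copy()
    prev_err = np.inf
    for _ in range(max_iter):
        delta = I - low_precision_mm(Z.T, low_precision_mm(S, Z))
        err = np.linalg.norm(delta, "fro")
        if err >= prev_err:
            break
        prev_err = err
        Z = Z + 0.5 * low_precision_mm(Z, delta)   # Z (I + delta/2)
    return Z

# Small demonstration: S is a random symmetric positive-definite "overlap" matrix,
# and the initial guess Z0 is the exact inverse Cholesky factor of a slightly
# shifted matrix, standing in for the factor from a previous MD time step.
rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n))
S = A @ A.T / n + np.eye(n)
S_prev = S + 0.05 * np.eye(n)
Z0 = np.linalg.inv(np.linalg.cholesky(S_prev)).T   # S_prev^{-1} = Z0 Z0^T
Z = refine_inverse_factor(S, Z0)
print("||Z^T S Z - I||_F =", np.linalg.norm(Z.T @ S @ Z - np.eye(n)))
```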


Source
http://dx.doi.org/10.1021/acs.jctc.4c00584

Publication Analysis

Top Keywords

tensor cores (12); matrix factorization (8); factorization inverse (8); inverse overlap (8); overlap matrix (8); matrix electronic (8); electronic structure (8); mixed precision (8); matrix eigenvalue (8); eigenvalue problem (8)

Similar Publications

Susceptibility formulation of density matrix perturbation theory.

J Chem Phys

December 2024

Division of Scientific Computing, Department of Information Technology, Uppsala University, Box 337, SE-751 05 Uppsala, Sweden.

Density matrix perturbation theory based on recursive Fermi-operator expansions provides a computationally efficient framework for time-independent response calculations in quantum chemistry and materials science. From a perturbation in the Hamiltonian, we can calculate the first-order perturbation in the density matrix, which then gives us the linear response in the expectation values for some chosen set of observables. We present an alternative, dual formulation, where we instead calculate the static susceptibility of an observable, which then gives us the linear response in the expectation values for any number of different Hamiltonian perturbations.
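
The duality between the two routes can be checked directly. The toy NumPy sketch below is my own illustration, not the authors' code: it uses the textbook zero-temperature sum-over-states response formula on a small random Hamiltonian (assuming a nondegenerate gap) rather than the recursive Fermi-operator expansion of the paper, and verifies that tr(A·dP[dH]) = tr(χ_A·dH), so the susceptibility χ_A of a single observable A, computed once, gives the linear response of ⟨A⟩ for any Hamiltonian perturbation dH.

```python
import numpy as np

def response_superoperator(H, n_occ, X):
    # First-order change of the zero-temperature density matrix P (projector onto the
    # n_occ lowest eigenstates of H) under H -> H + X, via the sum-over-states formula.
    eps, C = np.linalg.eigh(H)
    Xm = C.T @ X @ C                       # perturbation in the eigenbasis of H
    f = np.zeros(len(eps))
    f[:n_occ] = 1.0                        # occupation numbers
    num = f[:, None] - f[None, :]
    den = eps[:, None] - eps[None, :]
    K = np.divide(num, den, out=np.zeros_like(den), where=num != 0)  # symmetric response kernel
    return C @ (K * Xm) @ C.T

rng = np.random.default_rng(1)
n, n_occ = 8, 3
H = rng.standard_normal((n, n));  H = 0.5 * (H + H.T)     # Hamiltonian
A = rng.standard_normal((n, n));  A = 0.5 * (A + A.T)     # observable
dH = rng.standard_normal((n, n)); dH = 0.5 * (dH + dH.T)  # perturbation

dP = response_superoperator(H, n_occ, dH)    # density-matrix route: one dP per perturbation dH
chi_A = response_superoperator(H, n_occ, A)  # susceptibility route: one chi_A reused for all dH
print(np.trace(A @ dP), np.trace(chi_A @ dH))  # the two response values agree
```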


Optical neural networks (ONNs) are promising hardware platforms for next-generation neuromorphic computing due to their high parallelism, low latency, and low energy consumption. However, previous integrated photonic tensor cores (PTCs) consume numerous single-operand optical modulators for signal and weight encoding, leading to large area costs and high propagation loss to implement large tensor operations. This work proposes a scalable and efficient optical dot-product engine based on customized multi-operand photonic devices, namely the multi-operand optical neuron (MOON).


Photonics offers a transformative approach to artificial intelligence (AI) and neuromorphic computing by enabling low-latency, high-speed, and energy-efficient computations. However, conventional photonic tensor cores face significant challenges in constructing large-scale photonic neuromorphic networks. Here, we propose a fully integrated photonic tensor core, consisting of only two thin-film lithium niobate (TFLN) modulators, a III-V laser, and a charge-integration photoreceiver.


Tensor Network State Algorithms on AI Accelerators.

J Chem Theory Comput

October 2024

Strongly Correlated Systems "Lendület" Research Group, Wigner Research Centre for Physics, H-1525 Budapest, Hungary.

We introduce novel algorithmic solutions for hybrid CPU-multiGPU tensor network state algorithms utilizing non-Abelian symmetries, building on AI-motivated state-of-the-art hardware and software technologies. The presented numerical simulations on the FeMo cofactor, which plays a crucial role in converting atmospheric nitrogen to ammonia, are far beyond the scope of traditional approaches. Our large-scale SU(2) spin-adapted density matrix renormalization group calculations up to bond dimension = 2 on a complete active space (CAS) of 18 electrons in 18 orbitals [CAS(18, 18)] demonstrate that the current limit of exact solution, i.
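
As a rough sketch of why symmetry-adapted tensor network algorithms map well onto accelerators (my own toy illustration, not the paper's algorithm: it uses a simple Abelian quantum-number labeling with made-up sector sizes, not the non-Abelian SU(2) machinery of the paper), a symmetry-respecting operator is block-diagonal over quantum-number sectors, so one large contraction splits into many independent dense GEMMs that can be batched on GPUs or Tensor cores:

```python
import numpy as np

rng = np.random.default_rng(2)
sectors = {0: 40, 1: 64, 2: 24}   # quantum number -> block dimension (illustrative sizes)

def random_block_operator(sectors):
    # A symmetry-respecting operator stored as one dense block per quantum-number sector.
    return {q: rng.standard_normal((d, d)) for q, d in sectors.items()}

def block_matmul(A, B):
    # Each sector contracts independently; on accelerators these GEMMs are dispatched as a batch.
    return {q: A[q] @ B[q] for q in A}

A = random_block_operator(sectors)
B = random_block_operator(sectors)
C = block_matmul(A, B)

# The block structure also cuts the arithmetic cost relative to one dense product.
n_total = sum(sectors.values())
dense_flops = 2 * n_total**3
block_flops = sum(2 * d**3 for d in sectors.values())
print(f"dense: ~{dense_flops:.2e} flops, blocked: ~{block_flops:.2e} flops")
```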


