The primary focus of GAMESS over the last 5 years has been the development of new high-performance codes that are able to take effective and efficient advantage of the most advanced computer architectures, both CPU and accelerators. These efforts include employing density fitting and fragmentation methods to reduce the high scaling of well-correlated (e.g.
Electronic structure calculations have the potential to predict key matter transformations for applications of strategic technological importance, from drug discovery to material science and catalysis. However, a predictive physicochemical characterization of these processes often requires accurate quantum chemical modeling of complex molecular systems with hundreds to thousands of atoms. Due to the computationally demanding nature of electronic structure calculations and the complexity of modern high-performance computing hardware, quantum chemistry software has historically failed to operate at such large molecular scales with accuracy and speed that are useful in practice.
A novel implementation of the self-consistent field (SCF) procedure specifically designed for high-performance execution on multiple graphics processing units (GPUs) is presented. The algorithm offloads to GPUs the three major computational stages of the SCF, namely, the calculation of one-electron integrals, the calculation and digestion of electron repulsion integrals, and the diagonalization of the Fock matrix, including SCF acceleration via DIIS. Performance results for a variety of test molecules and basis sets show remarkable speedups with respect to the state-of-the-art parallel GAMESS CPU code and relative to other widely used GPU codes for both single and multi-GPU execution.
We present a high-performance, GPU (graphics processing unit)-accelerated algorithm for building the Fock matrix. The algorithm is designed for efficient calculations on large molecular systems and uses a novel dynamic load balancing scheme that maximizes the GPU throughput and avoids thread divergence that could occur due to integral screening. Additionally, the code adopts a novel ERI digestion algorithm that exploits all forms of permutational symmetry, efficiently combines the evaluation of Coulomb and exchange terms, and eliminates explicit thread synchronization requirements.
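As a rough illustration of what exploiting permutational symmetry in ERI digestion means, the sketch below evaluates each symmetry-unique integral (ij|kl) once and scatters it into the Coulomb and exchange parts of a closed-shell two-electron matrix. It is a minimal serial sketch, not the GPU algorithm described in the abstract: `toy_eri` is a hypothetical stand-in that only mimics the eightfold symmetry of real integrals, and all function names are assumptions made for this example.

```python
def pair_index(i, j):
    """Canonical index of the unordered pair (i, j)."""
    hi, lo = (i, j) if i >= j else (j, i)
    return hi * (hi + 1) // 2 + lo


def toy_eri(i, j, k, l):
    """Hypothetical stand-in for (ij|kl); it only reproduces the
    eightfold permutational symmetry, not real integral values."""
    p, q = pair_index(i, j), pair_index(k, l)
    return 1.0 / (1.0 + p + q + p * q)


def build_g_naive(density, n):
    """Reference build: visit all n**4 index quartets."""
    g = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                for d in range(n):
                    v = toy_eri(a, b, c, d)
                    g[a][b] += density[c][d] * v        # Coulomb
                    g[a][c] -= 0.5 * density[b][d] * v  # exchange
    return g


def build_g_symmetric(density, n):
    """Evaluate each unique integral once, then digest it into every
    distinct index permutation that shares its value."""
    g = [[0.0] * n for _ in range(n)]
    n_unique = 0
    for i in range(n):
        for j in range(i + 1):
            ij = pair_index(i, j)
            for k in range(i + 1):
                for l in range(k + 1):
                    if pair_index(k, l) > ij:
                        continue  # keep only quartets with ij >= kl
                    n_unique += 1
                    v = toy_eri(i, j, k, l)  # single evaluation
                    # A set removes duplicates when indices coincide.
                    perms = {
                        (i, j, k, l), (j, i, k, l), (i, j, l, k), (j, i, l, k),
                        (k, l, i, j), (l, k, i, j), (k, l, j, i), (l, k, j, i),
                    }
                    for a, b, c, d in perms:
                        g[a][b] += density[c][d] * v        # Coulomb
                        g[a][c] -= 0.5 * density[b][d] * v  # exchange
    return g, n_unique
```

For four basis functions the symmetric build evaluates 55 integrals instead of 256 and agrees with the naive build to rounding error. Production codes typically replace the permutation set with precomputed degeneracy factors and combine the digestion with integral screening.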
Electronic structure theory (especially quantum chemistry) has thrived and has become increasingly relevant to a broad spectrum of scientific endeavors as the sophistication of both computer architectures and software engineering has advanced. This article provides a brief history of advances in both hardware and software, from the early days of IBM mainframes to the current emphasis on accelerators and modern programming practices.
A discussion of many of the recently implemented features of GAMESS (General Atomic and Molecular Electronic Structure System) and LibCChem (the C++ CPU/GPU library associated with GAMESS) is presented. These features include fragmentation methods such as the fragment molecular orbital, effective fragment potential and effective fragment molecular orbital methods, hybrid MPI/OpenMP approaches to Hartree-Fock, and resolution of the identity second order perturbation theory. Many new coupled cluster theory methods have been implemented in GAMESS, as have multiple levels of density functional/tight binding theory.
The computational efficiency and energy-to-solution of several applications using the GAMESS quantum chemistry suite of codes are evaluated for 32-bit and 64-bit ARM-based computers, and compared to an x86 machine. The x86 system completes all benchmark computations more quickly than either ARM system and is the best choice to minimize time to solution. The ARM64 and ARM32 computational performances are similar to each other for Hartree-Fock and density functional theory energy calculations.
Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner, including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work-stealing runtime, and report performance results for long-range HF energy calculations on large molecules with high-quality basis sets, running on up to 1024 cores of a high-performance cluster machine.
Increasingly, modern computer systems comprise a multicore general-purpose processor augmented with a number of special purpose devices or accelerators connected via an external interface such as a PCI bus. The NVIDIA Kepler Graphical Processing Unit (GPU) and the Intel Phi are two examples of such accelerators. Accelerators offer peak performances that can be well above those of the host processor.
Use of the resolution of the Ewald operator method for computing long-range Coulomb and exchange interactions is presented. We show that the accuracy of this method can be controlled by a single parameter in a manner similar to that used by conventional algorithms that compute two-electron integrals. Significant performance advantages over conventional algorithms are observed, particularly for high quality basis sets and globular systems.
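For orientation, the long-range interaction discussed above is usually defined through the Ewald split of the Coulomb operator, 1/r = erfc(ωr)/r + erf(ωr)/r, where the second, smooth term is commonly called the Ewald operator. The sketch below only demonstrates this split numerically; it does not implement the resolution (the expansion of the long-range term) that the abstract refers to. The range-separation parameter `omega` and the function names are assumptions made for this example.

```python
import math


def short_range(r, omega):
    """Short-range part erfc(omega * r) / r; decays rapidly with r."""
    return math.erfc(omega * r) / r


def long_range(r, omega):
    """Long-range (Ewald) part erf(omega * r) / r; finite at r = 0."""
    if r == 0.0:
        # Limit of erf(omega * r) / r as r -> 0.
        return 2.0 * omega / math.sqrt(math.pi)
    return math.erf(omega * r) / r


if __name__ == "__main__":
    omega = 0.5  # hypothetical range-separation parameter
    for r in (0.1, 1.0, 5.0, 20.0):
        sr, lr = short_range(r, omega), long_range(r, omega)
        # The two parts always add up to the bare Coulomb operator 1/r.
        print(f"r={r:5.1f}  short={sr:.6e}  long={lr:.6e}  sum*r={(sr + lr) * r:.12f}")
```

Because the long-range term has no singularity at the origin, it is the part that lends itself to a smooth expansion, while the singular short-range term decays quickly enough to be screened at large separations.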
The use of gallium for cleaning hydrogen-contaminated Al2O3 surfaces is explored by performing first principles density functional calculations of gallium adsorption on a hydrogen-contaminated Al-terminated α-Al2O3(0001) surface. Both physisorbed and chemisorbed H-contaminated α-Al2O3(0001) surfaces with one monolayer (ML) gallium coverage are investigated. The thermodynamics of gallium cleaning are considered for a variety of different asymptotic products, and are found to be favorable in all cases.
The simulation of nonlinear ultrasound propagation through tissue-realistic media has a wide range of practical applications. However, this is a computationally difficult problem due to the large size of the computational domain compared to the acoustic wavelength. Here, the k-space pseudospectral method is used to reduce the number of grid points required per wavelength for accurate simulations.
For intermediate-sized chemical systems, the use of an auxiliary basis set (ABS) to fit the charge density provides a useful means of accelerating the performance of various quantum chemical methods. As a consequence, much effort has been devoted to the design of various ABSs. This paper explores a fundamentally new approach where the ABS is created dynamically based on the specific orbital basis set (OBS) being used.
The adsorption of Ga atoms in low coverage on the Al-terminated α-Al2O3(0001) surface has been studied theoretically by using first principles periodic boundary condition (PBC) calculations within the framework of the generalized gradient approximation (GGA). Eight possible adsorption sites are investigated, but only two are found to correspond to stationary points. Both of these locations are characterized as hollow sites, with three surrounding surface O atoms and an Al atom in the center located deeper within the Al2O3 slab.
QM/MM methods have been developed as a computationally feasible solution to QM simulation of chemical processes, such as enzyme-catalyzed reactions, within a more approximate MM representation of the condensed-phase environment. However, there has been no independent method for checking the quality of this representation, especially for highly nonisotropic protein environments such as those surrounding enzyme active sites. Hence, the validity of QM/MM methods is largely untested.