There has been great progress in developing methods for machine-learned potential energy surfaces. There have also been important assessments of these methods by comparing so-called learning curves on datasets of electronic energies and forces, notably the MD17 database. The dataset for each molecule in this database generally consists of tens of thousands of energies and forces obtained from DFT direct dynamics at 500 K. We contrast the datasets from this database for three "small" molecules, ethanol, malonaldehyde, and glycine, with datasets we have generated with specific targets for the potential energy surfaces (PESs) in mind: a rigorous calculation of the zero-point energy and wavefunction, the tunneling splitting in malonaldehyde, and, in the case of glycine, a description of all eight low-lying conformers. We find that the MD17 datasets are too limited for these targets. We also examine recent datasets for several PESs that describe small-molecule but complex chemical reactions. Finally, we introduce a new database, "QM-22," which contains datasets of molecules ranging from 4 to 15 atoms that extend to high energies and a large span of configurations.
Source: http://dx.doi.org/10.1063/5.0089200
Sci Data
February 2024
Department of Chemistry, University of Minnesota, Minneapolis, MN, 55414, USA.
System-specific neural force fields (NFFs) have gained popularity in computational chemistry. One of the most popular benchmark datasets for developing NFF models is the MD17 dataset and its subsequent extension. These datasets comprise geometries from the equilibrium region of the ground electronic state potential energy surface, sampled from direct adiabatic dynamics.
Sci Rep
January 2024
Department of Radiation Oncology, Brigham and Women's Hospital, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
Manual segmentation of tumors and organs-at-risk (OAR) in 3D imaging for radiation-therapy planning is time-consuming and subject to variation between different observers. Artificial intelligence (AI) can assist with segmentation, but challenges exist in ensuring high-quality segmentation, especially for small, variable structures, such as the esophagus. We investigated the effect of variation in segmentation quality and style of physicians for training deep-learning models for esophagus segmentation and proposed a new metric, edge roughness, for evaluating/quantifying slice-to-slice inconsistency.
Nat Commun
January 2024
Microsoft Research AI4Science, 100080, Beijing, China.
Geometric deep learning has been revolutionizing the molecular modeling field. Although state-of-the-art neural network models are approaching ab initio accuracy for molecular property prediction, their applications, such as drug discovery and molecular dynamics (MD) simulation, have been hindered by insufficient utilization of geometric information and high computational costs. Here we propose an equivariant geometry-enhanced graph neural network called ViSNet, which elegantly extracts geometric features and efficiently models molecular structures with low computational costs.
J Chem Inf Model
March 2023
Engineering Laboratory of Advanced Energy Materials, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, China.
This paper proposes a new interatomic potential energy neural network, AisNet, which can efficiently predict atomic energies and forces covering different molecular and crystalline materials by encoding universal local environment features, such as elements and atomic positions. Inspired by the framework of SchNet, AisNet consists of an encoding module combining an autoencoder with embedding, the triplet loss function and an atomic central symmetry function (ACSF), an interaction module with a periodic boundary condition (PBC), and a prediction module. In molecules, the prediction accuracy of AisNet is comparable to SchNet on the MD17 dataset, mainly attributed to the effective capture of chemical functional groups through the interaction module.
J Chem Theory Comput
January 2023
Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States.
There has been great progress in developing machine-learned potential energy surfaces (PESs) for molecules and clusters with more than 10 atoms. Unfortunately, this number of atoms generally limits the level of electronic structure theory to less than the "gold standard" CCSD(T) level. Indeed, for the well-known MD17 dataset for molecules with 9-20 atoms, all of the energies and forces were obtained with DFT calculations (PBE).