Predicting Experimental Heats of Formation via Deep Learning with Limited Experimental Data.

GuanYa Yang, Wai Yuet Chiu, Jiang Wu, Yi Zhou, ShuGuang Chen, WeiJun Zhou, Jiaqi Fan, GuanHua Chen

J Phys Chem A

Department of Chemistry, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China.

Published: September 2022

The study addresses the challenge of insufficient experimental data for training deep learning models to predict molecular properties, proposing a method that combines pretraining on quantum mechanical results with fine-tuning on limited experimental data.
The approach utilizes graph neural networks for pretraining, aiming to achieve molecular property predictions that are close to experimental accuracy, capitalizing on the qualitative correctness of quantum methods.
The model is applied to the heats of formation of organic molecules made of H, C, O, N, or F atoms (up to 30 atoms) using just 405 experimental data points, and achieves an overall mean absolute error of 1.8 kcal/mol without running a first-principles calculation for each prediction.

In predicting experimental values of molecular properties with deep learning, the key problem is the lack of sufficient experimental data for training. We propose a method that consists of pretraining a graph neural network to reproduce first-principles quantum mechanical results, followed by fine-tuning a fully connected neural network against experimental results. The combined pretraining and fine-tuning model is expected to yield molecular properties close to experimental accuracy. This is possible because first-principles quantum mechanical methods are often qualitatively correct or semiquantitatively accurate; thus, calibrating the calculated results against high-precision but limited experimental data can improve accuracy greatly. Moreover, the method is highly efficient, as the first-principles quantum mechanical calculation is bypassed. To demonstrate this, we apply the combined model to determine the experimental heats of formation of organic molecules made of H, C, O, N, or F atoms (up to 30 atoms), where a mere 405 experimental data points are used. The overall mean absolute error is 1.8 kcal/mol for these molecules.

DOI: http://dx.doi.org/10.1021/acs.jpca.2c02957
