Effect of Flattened Structures of Molecules and Materials on Machine Learning Model Training.

J Chem Inf Model

Center of Mathematics, Computation and Cognition, Federal University of ABC, Avenida dos Estados, 5001, 09210-580 Santo André, São Paulo, Brazil.

Published: September 2023

A key aspect of producing accurate and reliable machine learning models for predicting properties from quantum chemistry (QC) data is identifying data characteristics that may negatively influence model training. In previous work, we found that molecules and materials with a low volume of the convex hull (VCH) of atomic positions may harm model training and be a source of prediction outliers. In this paper, we extend that analysis with a biased sampling study that evaluates how the VCH of the training data influences a model, using different structures of molecules and materials. Our study confirms that VCH influences model training and shows the importance of using structures with homogeneous geometric characteristics when building new data sets or selecting training sets from larger QC data sets.
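
The abstract does not describe the sampling protocol, so the following is only a minimal sketch of what a VCH-biased selection of a training set could look like. It assumes that a VCH value has already been computed for every structure, and that structures are split into a "low VCH" and a "high VCH" group by a cutoff. The function name `biased_sample`, the parameters `low_fraction` and `threshold`, and the toy values are all hypothetical and are not taken from the paper.

```python
import random


def biased_sample(vch, n_train, low_fraction, threshold, seed=0):
    """Return n_train distinct indices into `vch`.

    About `low_fraction` of the returned indices point to structures whose
    VCH is below `threshold`; the rest point to structures at or above it.
    This two-group scheme is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    low = [i for i, v in enumerate(vch) if v < threshold]
    high = [i for i, v in enumerate(vch) if v >= threshold]

    n_low = round(n_train * low_fraction)
    n_high = n_train - n_low
    if n_low > len(low) or n_high > len(high):
        raise ValueError("not enough structures in one of the VCH groups")

    # Sample without replacement inside each group, then merge.
    return rng.sample(low, n_low) + rng.sample(high, n_high)


if __name__ == "__main__":
    toy_rng = random.Random(42)
    # Made-up VCH values in arbitrary units, not taken from any QC data set.
    toy_vch = [toy_rng.uniform(0.0, 100.0) for _ in range(1000)]
    for frac in (0.0, 0.1, 0.25):
        idx = biased_sample(toy_vch, n_train=200, low_fraction=frac, threshold=10.0)
        n_low = sum(toy_vch[i] < 10.0 for i in idx)
        print(f"requested low-VCH fraction {frac:.2f}: {n_low} of {len(idx)} selected")
```

Training the same model on subsets drawn with different `low_fraction` values, and comparing the errors on a fixed test set, would be one way to probe how the share of flattened structures affects training under these assumptions.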

Source
http://dx.doi.org/10.1021/acs.jcim.3c00242
