Size Doesn't Matter: Predicting Physico- or Biochemical Properties Based on Dozens of Molecules.

J Phys Chem Lett

Science Data Software, LLC, 14909 Forest Landing Circle, Rockville, Maryland 20850, United States.

Published: September 2021

Machine learning has become common practice in chemistry. Despite the success of modern machine learning methods, however, a lack of data limits their use. Transfer learning can help solve this problem. The methodology assumes that a model trained on a sufficiently large data set captures general features of the chemical structures it was trained on, and that reusing these features on a small data set will greatly improve the quality of the new model. In this paper, we develop this approach for small organic molecules, implementing transfer learning with graph convolutional neural networks. We show a significant improvement in model performance for target properties with scarce data. We also consider the effects of data set composition on model quality and the applicability domain of the resulting models.
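The transfer learning idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a small dense network on fixed-length feature vectors stands in for the graph convolutional network, the data are synthetic, and every name (pretrain, fit_head, predict, the layer sizes and learning rate) is a hypothetical choice made for illustration. A feature layer is trained on a large source data set, then frozen and reused while only a small linear head is fitted on a few dozen target samples.

```python
import numpy as np


def features(W, X):
    """Hidden-layer features shared by the source and target models."""
    return np.tanh(X @ W)


def pretrain(X, y, n_hidden=8, steps=500, lr=0.05, seed=0):
    """Train a one-hidden-layer regressor on the large source data set
    with plain gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    v = rng.normal(0.0, 0.5, size=n_hidden)
    n = len(y)
    for _ in range(steps):
        H = features(W, X)                      # (n, n_hidden)
        err = H @ v - y                         # (n,)
        grad_v = H.T @ err / n
        grad_W = X.T @ (np.outer(err, v) * (1.0 - H ** 2)) / n
        v -= lr * grad_v
        W -= lr * grad_W
    return {"W": W, "v": v}


def fit_head(W_frozen, X_small, y_small, ridge=1e-2):
    """Fit only a new linear head on the small target set (ridge
    regression); the pretrained feature weights are left untouched."""
    H = features(W_frozen, X_small)
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y_small)


def predict(W, head, X):
    return features(W, X) @ head


# Synthetic stand-in data (assumption): both properties depend on the
# same hidden structure, but the target set has only 30 samples.
rng = np.random.default_rng(1)
W_true = rng.normal(size=(5, 8))
X_large = rng.normal(size=(2000, 5))
y_large = np.tanh(X_large @ W_true) @ rng.normal(size=8)
X_small = rng.normal(size=(30, 5))
y_small = np.tanh(X_small @ W_true) @ rng.normal(size=8)

params = pretrain(X_large, y_large)
W_before = params["W"].copy()
head = fit_head(params["W"], X_small, y_small)
preds = predict(params["W"], head, X_small)
```

Freezing the feature layer keeps the number of parameters fitted on the small data set low, which is the motivation the abstract gives for reusing pretrained features. Whether fine-tuning some pretrained layers would work better is not something this sketch addresses.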

Source
http://dx.doi.org/10.1021/acs.jpclett.1c02477
