Molecular Contrastive Pretraining with Collaborative Featurizations.

J Chem Inf Model

Center for Research on Intelligent Perception and Computing, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.

Published: February 2024

AI Article Synopsis

  • Molecular pretraining learns molecular representations from large amounts of unlabeled data and has become a prominent paradigm in computational chemistry and drug discovery.
  • How different molecular featurizations (1D SMILES strings, 2D graphs, and 3D geometries), together with their corresponding neural architectures, affect pretraining remains largely unexamined.
  • The study introduces a framework called MOCO, which combines multiple complementary featurizations and outperforms existing state-of-the-art models that rely on only one or two featurizations across a wide range of molecular property prediction tasks.

Article Abstract

Molecular pretraining, which learns molecular representations from massive unlabeled data, has become a prominent paradigm for solving a variety of tasks in computational chemistry and drug discovery. Recently, substantial progress has been made in molecular pretraining with different molecular featurizations, including 1D SMILES strings, 2D graphs, and 3D geometries. However, the role of molecular featurizations and their corresponding neural architectures in molecular pretraining remains largely unexamined. In this paper, through two case studies (chirality classification and aromatic ring counting), we first demonstrate that different featurization techniques convey chemical information differently. In light of this observation, we propose a simple and effective MOlecular pretraining framework with COllaborative featurizations (MOCO). MOCO comprehensively leverages multiple featurizations that complement each other and outperforms existing state-of-the-art models that rely solely on one or two featurizations on a wide range of molecular property prediction tasks.
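The abstract does not spell out MOCO's training objective, so the following is only a minimal sketch of what contrastive pretraining across several featurizations could look like, not the authors' implementation. It assumes each featurization (labeled here with the hypothetical names "smiles", "graph2d", and "geom3d") has already been encoded into one embedding vector per molecule, and it swaps in a generic symmetric InfoNCE loss averaged over every pair of views. The function names, the temperature value, and the toy embeddings are all illustrative assumptions.

```python
import math
from itertools import combinations


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """One-directional InfoNCE loss.

    Row i of `anchors` is treated as matching row i of `positives`;
    every other row of `positives` acts as a negative.
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    loss = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity between anchor i and every candidate, scaled.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(a)


def multi_view_contrastive_loss(views, temperature=0.1):
    """Average symmetric InfoNCE over every pair of views.

    `views` maps a view name to a list of embeddings, with molecules
    listed in the same order in every view (an assumption of this sketch).
    """
    pairs = list(combinations(sorted(views), 2))
    total = 0.0
    for u, v in pairs:
        total += 0.5 * (
            info_nce(views[u], views[v], temperature)
            + info_nce(views[v], views[u], temperature)
        )
    return total / len(pairs)


if __name__ == "__main__":
    # Toy embeddings for two molecules in three hypothetical views.
    aligned = {
        "smiles": [[1.0, 0.0], [0.0, 1.0]],
        "graph2d": [[1.0, 0.0], [0.0, 1.0]],
        "geom3d": [[2.0, 0.0], [0.0, 3.0]],
    }
    # Same data, but the 2D-graph view has its molecules swapped.
    shuffled = dict(aligned, graph2d=[[0.0, 1.0], [1.0, 0.0]])
    print("aligned loss: ", multi_view_contrastive_loss(aligned))
    print("shuffled loss:", multi_view_contrastive_loss(shuffled))
```

In this toy setup the loss is low when all views place the same molecule close together and rises when one view is misaligned, which is the signal a multi-view contrastive objective of this kind would use to pull the encoders for different featurizations toward a shared representation.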

Source
http://dx.doi.org/10.1021/acs.jcim.3c01468
