Automated Retrosynthesis Planning of Macromolecules Using Large Language Models and Knowledge Graphs.

Macromol Rapid Commun

The State Key Laboratory of Molecular Engineering of Polymers, Research Center of AI for Polymer Science, Department of Macromolecular Science, Fudan University, Shanghai, 200433, China.

Published: February 2025

Identifying reliable synthesis pathways in materials chemistry is a complex task, particularly in polymer science, due to the intricate and often nonunique nomenclature of macromolecules. To address this challenge, an agent system that integrates large language models (LLMs) and knowledge graphs is proposed. By leveraging LLMs' capabilities for extracting and recognizing chemical substance names, and by storing the extracted data in a structured knowledge graph, the system fully automates the retrieval of relevant literature, extraction of reaction data, database querying, construction of retrosynthetic pathway trees, further expansion through the retrieval of additional literature, and recommendation of optimal reaction pathways. To account for the complex interdependencies among chemical reactants, a novel Multi-branched Reaction Pathway Search Algorithm (MBRPS) is proposed to help identify all valid multi-branched reaction pathways, which arise when a single product decomposes into multiple reaction intermediates. In contrast, previous studies are limited to cases where a product decomposes into at most one reaction intermediate. This work represents the first attempt to develop a fully automated, LLM-powered retrosynthesis planning agent tailored specifically to macromolecules. Applied to polyimide synthesis, the new approach constructs a retrosynthetic pathway tree with hundreds of pathways and recommends optimized routes, including both known and novel pathways.
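The abstract does not spell out how MBRPS works, so the following is only a minimal sketch of the general idea of multi-branched pathway enumeration, not the authors' algorithm. It assumes a toy reaction table in which each product maps to alternative reactions and every reactant of a chosen reaction must itself be resolvable. The substance names, the `REACTIONS` and `PURCHASABLE` tables, and the function `enumerate_pathways` are all hypothetical placeholders.

```python
from itertools import product

# Hypothetical toy data (not taken from the paper): each product maps to a
# list of alternative reactions, and each reaction is a tuple of reactants.
REACTIONS = {
    "polyimide": [("dianhydride", "diamine")],
    "dianhydride": [("tetraacid",)],
    "diamine": [("dinitro",), ("dihalide", "ammonia")],
}
# Assumed set of starting materials that need no further decomposition.
PURCHASABLE = {"tetraacid", "dinitro", "dihalide", "ammonia"}


def enumerate_pathways(target, reactions, purchasable, visiting=frozenset()):
    """Return all pathways to `target` as lists of (product, reactants) steps.

    A reaction with several non-purchasable reactants creates several
    branches; a pathway is valid only if every branch can be resolved.
    """
    if target in purchasable:
        return [[]]  # exactly one pathway, requiring no reaction steps
    if target in visiting:
        return []  # cycle: this branch cannot be resolved
    pathways = []
    for reactants in reactions.get(target, []):
        # Resolve each reactant independently, guarding against cycles.
        branches = [
            enumerate_pathways(r, reactions, purchasable, visiting | {target})
            for r in reactants
        ]
        # Combine one sub-pathway per reactant; if any reactant has no
        # sub-pathway, the product is empty and the reaction is skipped.
        for combo in product(*branches):
            steps = [(target, reactants)]
            for branch in combo:
                steps.extend(branch)
            pathways.append(steps)
    return pathways


routes = enumerate_pathways("polyimide", REACTIONS, PURCHASABLE)
for route in routes:
    print(route)
```

In this toy table the target splits into two intermediates, one of which has two alternative preparations, so the Cartesian product over branches yields two complete routes. A single-intermediate search would follow only one reactant per step and could not report whether the other branch is resolvable.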

Source
http://dx.doi.org/10.1002/marc.202500065

Publication Analysis

Top Keywords

automated retrosynthesis (8)
retrosynthesis planning (8)
large language (8)
language models (8)
knowledge graphs (8)
retrosynthetic pathway (8)
reaction pathways (8)
multi-branched reaction (8)
product decomposes (8)
reaction (6)

Similar Publications

Nucleophilicity and electrophilicity are important properties for evaluating the reactivity and selectivity of chemical reactions. They allow nucleophiles and electrophiles to be ranked on reactivity scales, enabling a better understanding and prediction of reaction outcomes. Building upon our recent work (N.

With the advent of artificial intelligence (AI), it is now possible to design diverse and novel molecules from previously unexplored chemical space. However, a challenge for chemists is the synthesis of such molecules. Recently, there have been attempts to develop AI models for retrosynthesis prediction, which rely on the availability of a high-quality training dataset.

Automated synthesis planning has recently re-emerged as a research area at the intersection of chemistry and machine learning. Despite the appearance of steady progress, we argue that imperfect benchmarks and inconsistent comparisons mask systematic shortcomings of existing techniques, and unnecessarily hamper progress. To remedy this, we present a synthesis planning library with an extensive benchmarking framework, called SYNTHESEUS, which promotes best practice by default, enabling consistent meaningful evaluation of single-step and multi-step synthesis planning algorithms.

Motivation: Retrosynthesis identifies available precursor molecules for various and novel compounds. With the advancements and practicality of language models, Transformer-based models have increasingly been used to automate this process. However, many existing methods struggle to efficiently capture reaction transformation information, limiting the accuracy and applicability of their predictions.
