Investigations into the Efficiency of Computer-Aided Synthesis Planning.

J Chem Inf Model

Molecular AI, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, 431 83 Mölndal, Sweden.

Published: February 2025

The efficiency of machine learning (ML) models is crucial for minimizing inference times and reducing the carbon footprint of models deployed in production environments. Current models employed in retrosynthesis to generate a synthesis route from a target molecule to purchasable compounds are prohibitively slow. These models operate in a single-step fashion within a tree search algorithm, predicting reactant molecules given a product molecule as input. In this study, we investigate the ability of alternative transformer architectures, knowledge distillation (KD), and simple hyperparameter optimization to decrease inference times of the Chemformer model. Initially, we assess closely related transformer architectures and conclude that these models underperformed when using KD. Additionally, we investigate the effects of feature-based and response-based KD together with hyperparameters optimized for inference sample time and model accuracy. Although reducing model size and improving single-step speed are important, our results indicate that multi-step search efficiency is more strongly influenced by the diversity and confidence of single-step models. Based on this work, further research should use KD in combination with other techniques, as multi-step speed continues to prevent proper integration of synthesis planning. However, in Monte Carlo (MC)-based multi-step retrosynthesis, other factors play a crucial role in balancing exploration and exploitation during the search process, often outweighing the direct impact of single-step model speed and carbon footprint.
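The two KD variants named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the temperature, and the weighting factor `alpha` are all assumptions chosen for illustration, and the feature-based loss assumes student and teacher features already share the same dimension.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def response_kd_loss(student_logits, teacher_logits, target_index,
                     temperature=2.0, alpha=0.5):
    """Response-based KD: match the teacher's softened output distribution.

    Combines KL(teacher || student) on temperature-softened outputs with the
    usual cross-entropy on the ground-truth token. `temperature` and `alpha`
    are illustrative defaults, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    cross_entropy = -math.log(softmax(student_logits)[target_index])
    # The temperature**2 factor keeps the soft-target term on a comparable
    # scale when the temperature is changed.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy


def feature_kd_loss(student_features, teacher_features):
    """Feature-based KD: mean squared error between hidden representations."""
    if len(student_features) != len(teacher_features):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_features, teacher_features)]
    return sum(squared) / len(squared)
```

In a real training loop these terms would be computed on batched tensors and summed over output tokens; the scalar version here only shows what each loss compares.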

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11863376
DOI: http://dx.doi.org/10.1021/acs.jcim.4c01821

Similar Publications

Background: Acquired brain injury (ABI), including traumatic brain injury and hypoxic/anoxic injury, presents significant public health concerns; however, existing literature has focused primarily on male populations, such as military personnel and contact sports participants. Sex-related differences in ABI outcomes necessitate focused research due to potential heightened risk and distinct physiological responses among females.

Objectives: This pilot study aims to explore fluid-based biomarkers for neurological injury and inflammation in females experiencing intimate partner violence (IPV)-related assaults to the head, neck, or face.

Background: The macronutrient composition of daily meals plays a crucial role in influencing the body's metabolic responses during the postprandial phase. However, existing research on the effects of macronutrients, particularly fats and carbohydrates, has produced inconsistent findings.

Objectives: This study aims to evaluate the postprandial effects of two high-protein meals, one low in fat and high in carbohydrates (HP-LF-HC) and the other high in fat and low in carbohydrates (HP-HF-LC), on energy metabolism, appetite response, and blood markers in overweight and obese men and women without underlying health conditions.

Computational tools for the prediction of site- and regioselectivity of organic reactions.

Chem Sci

March 2025

Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 43183 Mölndal, Sweden

The regio- and site-selectivity of organic reactions is one of the most important aspects of synthesis planning. Consequently, substantial research effort has been invested in computational models for regio- and site-selectivity prediction, and the introduction of machine learning to the chemical sciences within the past decade has added a whole new dimension to these endeavors. This review article walks through the currently available predictive tools for regio- and site-selectivity, with a particular focus on machine learning models, and is organized by the individual reaction classes of organic chemistry.

Clinical practice guidelines (CPGs) are shared through various dissemination strategies using a range of dissemination products and channels. However, users may have different needs for accessing and understanding them. Patients and carers from low- and middle-income countries might face challenges in accessing CPGs such as inadequate systems for printed book distribution and insufficient and substandard photocopies.

The application of large language models in materials science has opened new avenues for accelerating materials development. Building on this advancement, we propose a novel framework leveraging large language models to optimize experimental procedures for synthesizing quantum dot materials with multiple desired properties. Our framework integrates the synthesis protocol generation model and the property prediction model, both fine-tuned on open-source large language models using parameter-efficient training techniques with in-house synthesis protocol data.
