C-H borylation is a high-value transformation in the synthesis of lead candidates for the pharmaceutical industry because a wide array of downstream coupling reactions is available. However, predicting its regioselectivity, especially in drug-like molecules that may contain multiple heterocycles, is not a trivial task. Using a data set of borylation reactions from Reaxys, we explored how a language model originally trained on USPTO_500_MT, a broad-scope set of patent data, can be used to predict the C-H borylation reaction product in different modes: product generation and site reactivity classification.
In the pursuit of improved compound identification and database search tasks, this study explores heteronuclear single quantum coherence (HSQC) spectra simulation and matching methodologies. HSQC spectra serve as unique molecular fingerprints, enabling a valuable balance of data collection time and information richness. We conducted a comprehensive evaluation of the following four HSQC simulation techniques: ACD/Labs (ACD), MestReNova (MNova), Gaussian NMR calculations (DFT), and a graph-based neural network (ML).
Angew Chem Weinheim Bergstr Ger
June 2023
The epigenetic modification 5-methylcytosine plays a vital role in development, cell-specific gene expression and disease states. The selective chemical modification of the 5-methylcytosine methyl group is challenging. Currently, no such chemistry exists.
Owing to its high natural abundance compared to the commonly used transition (precious) metals, as well as its high Lewis acidity and ability to change oxidation state, aluminium has recently been explored as the basis for a range of single-site catalysts. This paper aims to establish the ground rules for the development of a new type of cationic alkene oligomerisation catalyst containing two Al(III) ions, with the potential to act co-operatively in stereoselective assembly. Five new dimers of the type [RAl(2-py')] (R=Me, Bu; py'=substituted pyridyl group) with different substituents on the Al atoms and pyridyl rings have been synthesised.
The selectivity in a group of oxazaborolidinium ion-catalysed reactions between aldehyde and diazo compounds cannot be explained using transition state theory. VRAI-selectivity, developed to predict the outcome of dynamically controlled reactions, can account for both the chemo- and the stereo-selectivity in these reactions, which are controlled by reaction dynamics. Subtle modifications to the substrate or catalyst substituents alter the potential energy surface, leading to changes in predominant reaction pathways and altering the barriers to the major product when reaction dynamics are considered.
Machine Learning (ML) is increasingly applied to fill data gaps in assessments to quantify impacts associated with chemical emissions and chemicals in products. However, the systematic application of ML-based approaches to fill chemical data gaps is still limited, and their potential for addressing a wide range of chemicals is unknown. We prioritized chemical-related parameters for chemical toxicity characterization to inform ML model development based on two criteria: (1) each parameter's relevance to robustly characterize chemical toxicity described by the uncertainty in characterization results attributable to each parameter and (2) the potential for ML-based approaches to predict parameter values for a wide range of chemicals described by the availability of chemicals with measured parameter data.
J Chem Inf Model
July 2023
CONFPASS (Conformer Prioritizations and Analysis for DFT re-optimizations) has been developed to extract dihedral angle descriptors from conformational searching outputs, perform clustering, and return a priority list for density functional theory (DFT) re-optimizations. Evaluations were conducted with DFT data of the conformers for 150 structurally diverse molecules, most of which are flexible. CONFPASS gives a confidence estimate that the global minimum structure has been found, and based on our dataset, we can have 90% confidence after optimizing half of the force-field (FF) structures.
Vibrational circular dichroism (VCD) spectroscopy can generate the data required for the assignment of absolute configuration, but the spectra are hard to interpret. We have recorded VCD data for thirty pairs of small organic compounds and we use this database to validate a method for the automated analysis of VCD spectra and the assignment of absolute configuration: the Cai•factor (Configuration: absolute information). The analysis of the data demonstrates that the procedure is a reliable and time-efficient method for determination of absolute configuration, which gives both the assignment and a measure of confidence in the outcome, even when the spectra are imperfect.
Computational reaction prediction has become a ubiquitous task in chemistry due to the potential value accurate predictions can bring to chemists. Boronic acids are widely used in industry; however, understanding how to avoid the protodeboronation side reaction remains a challenge. We have developed an algorithm for prediction of the rate of protodeboronation of boronic acids.
The use of machine learning techniques in computational chemistry has gained significant momentum since large molecular databases are now readily available. Predictions of molecular properties using machine learning have advantages over traditional quantum mechanics calculations because they can be computationally cheaper without losing accuracy. We present a new extrapolatable and explainable molecular representation based on bonds, angles and dihedrals that can be used to train machine learning models.
Whenever a new molecule is made, a chemist will justify the proposed structure by analysing the NMR spectra. The widely used DP4 algorithm will choose the best match from a series of possibilities, but draws no conclusions from a single candidate structure. Here we present the DP5 probability, a step-change in the quantification of molecular uncertainty: given one structure and one ¹³C NMR spectrum, DP5 gives the probability of the structure being correct.
Org Biomol Chem
November 2021
N-Triflylphosphoramides (NTPA) have become increasingly popular catalysts in the development of enantioselective transformations as they are stronger Brønsted acids than the corresponding phosphoric acids (PA). Their highly acidic, asymmetric active site can activate difficult, unreactive substrates. In this review, we present an account of asymmetric transformations using this type of catalyst that have been reported in the past ten years and we classify these reactions using the enantio-determining step as the key criterion.
Deep learning neural networks, constructed for the prediction of chemical binding at 79 pharmacologically important human biological targets, show extremely high performance on test data (accuracy 92.2 ± 4.2%, MCC 0.
The software for the IUPAC Chemical Identifier, InChI, is extraordinarily reliable. It has been tested on large databases around the world, and has proved itself to be an essential tool in the handling and integration of large chemical databases. InChI version 1.
Org Biomol Chem
May 2021
In recent years, a growing number of organic reactions in the literature have shown selectivity controlled by reaction dynamics rather than by transition state theory. Such reactions are difficult to analyse because the transition state theory approach often does not capture the subtlety of the energy landscapes the compounds traverse and, therefore, cannot accurately predict the selectivity. We present an algorithm that can predict the major product and selectivity for a wide range of potential energy surfaces where the product distribution is influenced by reaction dynamics.
Background: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditional rodent acute toxicity tests.
In recent times, machine learning has become increasingly prominent in predictive toxicology as it has shifted from in vivo studies toward in silico studies. Currently, in vitro methods together with other computational methods such as quantitative structure-activity relationship modeling and absorption, distribution, metabolism, and excretion calculations are being used. An overview of machine learning and its applications in predictive toxicology is presented here, including support vector machines (SVMs), random forest (RF) and decision trees (DTs), neural networks, regression models, naïve Bayes, k-nearest neighbors, and ensemble learning.
Having a measure of confidence in computational predictions of biological activity from in silico tools is vital when making predictions for new chemicals, for example, in chemical risk assessment. Where predictions of biological activity are used as an indicator of a potential hazard, false-negative predictions are the most concerning; however, assigning confidence in inactive predictions is particularly challenging. How can one confidently identify the absence of activating features? In this study, we present methods for assigning confidence to both active and inactive predictions from structural alerts for protein-binding molecular initiating events (MIEs).
The Minisci reaction is one of the most valuable methods for directly functionalizing basic heteroarenes to form carbon-carbon bonds. Use of prochiral, heteroatom-substituted radicals results in stereocenters being formed adjacent to the heteroaromatic system, generating motifs which are valuable in medicinal chemistry and chiral ligand design. Recently a highly enantioselective and regioselective protocol for the Minisci reaction was developed, using chiral phosphoric acid catalysis.
Characterization of the complex molecular scaffold of the marine polyketide natural product phormidolide A represents a challenge that has persisted for nearly two decades. In light of discordant results arising from recent synthetic and biosynthetic reports, a rigorous study of the configuration of phormidolide A was necessary. This report outlines a synergistic effort employing computational and anisotropic NMR investigation that provided orthogonal confirmation of the reassigned side chain and supported a further correction of the C7 stereocenter.
Molecular initiating events (MIEs) are key events in adverse outcome pathways that link molecular chemistry to target biology. As they are based on chemistry, these interactions are excellent targets for computational chemistry approaches to in silico modeling. In this work, we aim to link ligand chemical structures to MIEs for androgen receptor (AR) and glucocorticoid receptor (GR) binding using ToxCast data.