Electrochemical C-H oxidation reactions offer a sustainable route to functionalize hydrocarbons, yet identifying suitable substrates and optimizing synthesis remain challenging. Here, we report an integrated approach combining machine learning and large language models to streamline the exploration of electrochemical C-H oxidation reactions. Utilizing a batch rapid screening electrochemical platform, we evaluated a wide range of reactions, initially classifying substrates by their reactivity, while LLMs text-mined literature data to augment the training set.
The mechanism of Pd-catalyzed amination of five-membered heteroaryl halides was investigated by integrating experimental kinetic analysis with kinetic modeling through predictive testing and likelihood ratio analysis, revealing an atypical productive coupling pathway and multiple off-cycle events. The GPhos-supported Pd catalyst, along with the moderate-strength base NaOTMS, was previously found to promote efficient coupling between five-membered heteroaryl halides and secondary amines. However, slight deviations from the optimal concentration, temperature, and/or solvent resulted in significantly lower yields, contrary to typical reaction optimization trends.
Reaction optimization and characterization depend on reliable measures of reaction yield, often measured by high-performance liquid chromatography (HPLC). Peak areas in HPLC chromatograms are correlated to analyte concentrations by way of calibration standards, typically pure samples of known concentration. Preparing the pure material required for calibration runs can be tedious for low-yielding reactions and technically challenging at small reaction scales.
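The conventional calibration step described above, relating peak area to concentration through standards of known concentration, can be sketched as a straight-line fit. This is only an illustration of the standard approach, not the method of the paper; the function names, the standard concentrations, and the peak areas are all hypothetical.

```python
def fit_calibration(concentrations, peak_areas):
    """Least-squares fit of: area = slope * concentration + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(peak_areas):
        raise ValueError("need at least two matched (concentration, area) pairs")
    mean_c = sum(concentrations) / n
    mean_a = sum(peak_areas) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def area_to_concentration(area, slope, intercept):
    """Invert the calibration line to estimate an unknown concentration."""
    return (area - intercept) / slope


# Hypothetical standards (mM) and their peak areas (arbitrary units).
standards_mM = [0.5, 1.0, 2.0, 4.0]
areas = [52.0, 102.0, 202.0, 402.0]

slope, intercept = fit_calibration(standards_mM, areas)
estimated_mM = area_to_concentration(252.0, slope, intercept)
```

Each standard in such a fit requires a pure sample, which is the preparation burden the abstract points to.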
Functionalization of lead compounds to create analogs is a challenging step in discovering new molecules with desired properties and it is conducted throughout the chemical industry, including pharmaceuticals and agrochemicals. The process can be time-consuming and expensive, requiring expert intuition and experience. To help address synthesis planning challenges in late-stage functionalization, we have developed a molecular similarity approach that proposes single-step functionalization reactions based on analogy to precedent reactions.
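One common way to rank precedents by molecular similarity is the Tanimoto coefficient over sets of structural features. The sketch below assumes that metric and uses hand-written, hypothetical feature sets in place of real fingerprints; the abstract does not state which similarity measure or representation the authors use.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two feature collections."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_precedents(query_features, precedents):
    """Return (similarity, name) pairs, most similar precedent first."""
    scored = [(tanimoto(query_features, feats), name)
              for name, feats in precedents.items()]
    return sorted(scored, reverse=True)


# Hypothetical substructure labels standing in for fingerprint bits.
query = {"aryl", "pyridine", "amide", "methoxy"}
precedent_substrates = {
    "precedent_A": {"aryl", "pyridine", "amide"},
    "precedent_B": {"aryl", "ester"},
    "precedent_C": {"alkyl", "ketone"},
}

ranked = rank_precedents(query, precedent_substrates)
best_similarity, best_name = ranked[0]
```

In practice the reaction recorded for the top-ranked precedent substrate would be the one proposed for the query molecule.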
A closed-loop, autonomous molecular discovery platform driven by integrated machine learning tools was developed to accelerate the design of molecules with desired properties. We demonstrated two case studies on dye-like molecules, targeting absorption wavelength, lipophilicity, and photooxidative stability. In the first study, the platform experimentally realized 294 unreported molecules across three automatic iterations of molecular design-make-test-analyze cycles while exploring the structure-function space of four rarely reported scaffolds.
The presence of solids as starting reagents/reactants or products in flow photochemical reactions can lead to reactor clogging and yield reduction from side reactions. We address this limitation with a new ultrasonic microreactor for continuous solid-laden photochemical reactions. The ultrasonic photochemical microreactor is characterized by the liquid and solid residence time distribution (RTD) and by the absorbed photon flux in the reactor, determined by chemical actinometry.
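A residence time distribution is usually obtained by normalizing a tracer response curve, E(t) = C(t) / ∫C dt, with the mean residence time given by ∫ t E(t) dt. The following minimal sketch applies that textbook definition with trapezoidal integration; the tracer data are hypothetical and the abstract does not describe how the authors processed their measurements.

```python
def trapezoid(xs, ys):
    """Trapezoidal-rule integral of ys over xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def residence_time_distribution(times, tracer):
    """Normalize a tracer curve so that E(t) integrates to one."""
    area = trapezoid(times, tracer)
    if area <= 0:
        raise ValueError("tracer signal must have positive area")
    return [c / area for c in tracer]


def mean_residence_time(times, tracer):
    """First moment of the normalized distribution E(t)."""
    e_curve = residence_time_distribution(times, tracer)
    return trapezoid(times, [t * e for t, e in zip(times, e_curve)])


# Hypothetical pulse-tracer response: time in minutes, signal in arbitrary units.
times_min = [0.0, 1.0, 2.0, 3.0, 4.0]
tracer_signal = [0.0, 1.0, 2.0, 1.0, 0.0]

e_curve = residence_time_distribution(times_min, tracer_signal)
tau_min = mean_residence_time(times_min, tracer_signal)
```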
We present an automated droplet reactor platform possessing parallel reactor channels and a scheduling algorithm that orchestrates all of the parallel hardware operations and ensures droplet integrity as well as overall efficiency. We design and incorporate all of the necessary hardware and software to enable the platform to be used to study both thermal and photochemical reactions. We incorporate a Bayesian optimization algorithm into the control software to enable reaction optimization over both categorical and continuous variables.
Chemoenzymatic synthesis methods use organic and enzyme chemistry to synthesize a desired small molecule. Complementing organic synthesis with enzyme-catalyzed selective transformations under mild conditions enables more sustainable and synthetically efficient chemical manufacturing. Here, we present a multistep retrosynthesis search algorithm to facilitate chemoenzymatic synthesis of pharmaceutical compounds, specialty chemicals, commodity chemicals, and monomers.
The Community Resource for Innovation in Polymer Technology (CRIPT) data model is designed to address the high complexity in defining a polymer structure and the intricacies involved with characterizing material properties.
Automation and digitalization solutions in the field of small molecule synthesis face new challenges for chemical reaction analysis, especially in the field of high-performance liquid chromatography (HPLC). Chromatographic data remains locked in vendors' hardware and software components, limiting their potential in automated workflows and data science applications. In this work, we present an open-source Python project called MOCCA for the analysis of HPLC-DAD (photodiode array detector) raw data.
The molecular structures synthesizable by organic chemists dictate the molecular functions they can create. The invention and development of chemical reactions are thus critical for chemists to access new and desirable functional molecules in all disciplines of organic chemistry. This work seeks to expedite the exploration of emerging areas of organic chemistry by devising a machine-learning-guided workflow for reaction discovery.
For many experimentally measured chemical properties that cannot be directly computed from first-principles, the existing physics-based models do not extrapolate well to out-of-sample molecules, and experimental datasets themselves are too small for traditional machine learning (ML) approaches. To overcome these limitations, we apply a transfer learning approach, whereby we simultaneously train a multi-target regression model on a small number of molecules with experimentally measured values and a large number of molecules with related computed properties. We demonstrate this methodology on predicting the experimentally measured impact sensitivity of energetic crystals, finding that both characteristics of the computed dataset and model architecture are important to prediction accuracy of the small experimental dataset.
Computer-aided synthesis planning (CASP) tools can propose retrosynthetic pathways and forward reaction conditions for the synthesis of organic compounds, but the limited availability of context-specific data currently necessitates experimental development to fully specify process details. We plan and optimize a CASP-proposed and human-refined multistep synthesis route toward an exemplary small molecule, sonidegib, on a modular, robotic flow synthesis platform with integrated process analytical technology (PAT) for data-rich experimentation. Human insights address catalyst deactivation and improve yield by strategic choices of order of addition.
Enzymes synthesize complex natural products effortlessly by catalyzing chemo-, regio-, and enantio-selective transformations. Further, biocatalytic processes are increasingly replacing conventional organic synthesis steps because they use mild solvents, avoid the use of metals, and reduce overall non-biodegradable waste. Here, we present a single-step retrosynthesis search algorithm to facilitate enzymatic synthesis of natural product analogs.
The implementation of self-optimizing flow reactors has been mostly limited to model reactions or known synthesis routes. In this work, a self-optimizing flow photochemistry platform is used to develop an original synthesis of the bioactive fragment of Salbutamol and derivatives. The key photochemical steps for the construction of the aryl vicinal amino alcohol moiety consist of a C-C bond forming reaction followed by an unprecedented, high-yielding (>80%) benzylic oxidative cyclization.
Automation and microfluidic tools potentially enable efficient, fast, and focused reaction development of complex chemistries, while minimizing resource and material consumption. The introduction of automation-assisted workflows will contribute to the more sustainable development and scale-up of new and improved catalytic technologies. Herein, the application of automation and microfluidics to the development of a complex asymmetric hydrogenation reaction is described.
CD8 T cell responses are the foundation of the recent clinical success of immunotherapy in oncologic indications. Although checkpoint inhibitors have enhanced the activity of existing CD8 T cell responses, therapeutic approaches to generate Ag-specific CD8 T cell responses have had limited success. Here, we demonstrate that cytosolic delivery of Ag through microfluidic squeezing enables MHC class I presentation to CD8 T cells by diverse cell types.
Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine-learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated high-throughput experiments and flow chemistry.
Accurate and rapid evaluation of whether substrates can undergo the desired transformation is crucial and challenging for both human knowledge and computer predictions. Despite the potential of machine learning in predicting chemical reactivity such as selectivity, popular feature engineering and learning methods are either time-consuming or data-hungry. We introduce a new method that combines machine-learned reaction representation with selected quantum mechanical descriptors to predict regio-selectivity in general substitution reactions.
With recent advances in computer-aided synthesis planning (CASP) powered by data science and machine learning, modern CASP programs can rapidly identify thousands of potential pathways for a given target molecule. However, the lack of a holistic pathway evaluation mechanism makes it challenging to systematically prioritize strategic pathways except for using some simple heuristics. Herein, we introduce a data-driven approach to evaluate the relative strategic levels of retrosynthesis pathways using a dynamic tree-structured long short-term memory (tree-LSTM) model.
Access to structured chemical reaction data is of key importance for chemists in performing bench experiments and in modern applications like computer-aided drug design. Existing reaction databases are generally populated by human curators through manual abstraction from published literature (e.g.
The synthesis of thousands of candidate compounds in drug discovery and development offers opportunities for computer-aided synthesis planning to simplify the synthesis of molecule libraries by leveraging common starting materials and reaction conditions. We develop an optimization-based method to analyze large organic chemical reaction networks and design overlapping synthesis plans for entire molecule libraries so as to minimize the overall number of unique chemical compounds needed as either starting materials or reaction conditions. We consider multiple objectives, including the number of starting materials, the number of catalysts/solvents/reagents, and the likelihood of success of the overall synthesis plan, to select an optimal reaction network to access the target molecules.
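The core idea of sharing starting materials across a library can be illustrated with a toy, single-objective version of the problem: pick one candidate route per target so that the union of starting materials is as small as possible. The sketch below uses exhaustive search and hypothetical route data. It is not the authors' formulation, which treats several objectives over large reaction networks and would need a proper optimization solver at scale.

```python
from itertools import product


def choose_routes(route_options):
    """Pick one route per target, minimizing unique starting materials.

    route_options maps each target to a list of (route_name, materials) pairs.
    Exhaustive search: only practical for very small examples.
    """
    targets = sorted(route_options)
    best_plan, best_materials = None, None
    for combo in product(*(route_options[t] for t in targets)):
        # Union of the starting materials across the selected routes.
        materials = set().union(*(mats for _, mats in combo))
        if best_materials is None or len(materials) < len(best_materials):
            best_plan = {t: name for t, (name, _) in zip(targets, combo)}
            best_materials = materials
    return best_plan, best_materials


# Hypothetical library: two targets, two candidate routes each.
library_routes = {
    "target_1": [("route_a", {"SM1", "SM2"}), ("route_b", {"SM3", "SM4"})],
    "target_2": [("route_a", {"SM5", "SM6"}), ("route_b", {"SM1", "SM2", "SM7"})],
}

plan, materials = choose_routes(library_routes)
```

Here the longer route to target_2 is preferred because it reuses SM1 and SM2, so the library needs three starting materials instead of four.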
Electroorganic synthesis is a promising tool to design sustainable transformations and discover new reactivities. However, the added setup complexity caused by electrodes in the system impedes efficient screening of reaction conditions. Herein, we present a microfluidic platform that enables automated high-throughput experimentation (HTE) for electroorganic synthesis at a 15-microliter scale.