The RXN for Chemistry project, initiated by IBM Research Europe - Zurich in 2017, aimed to develop a series of digital assets that use machine learning to promote data-driven methodologies in synthetic organic chemistry. The research treats chemical reaction data as language records and frames the prediction of a synthetic organic chemistry reaction as a translation task between precursor and product languages. Over the years, the IBM Research team has developed language models for a range of applications, including forward reaction prediction, retrosynthesis, reaction classification, atom-mapping, procedure extraction from text, and inference of experimental protocols, as well as the use of those protocols to program commercial automation hardware for an autonomous chemical laboratory.
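To make the "reaction as translation" framing concrete, the sketch below splits a reaction SMILES string into a source (precursor) token sequence and a target (product) token sequence, which is the shape of input a sequence-to-sequence model would consume. The token pattern, the function names, and the example reaction are assumptions chosen for illustration; they are not taken from the RXN for Chemistry code base, and the tokenizer is deliberately simplified.

```python
import re

# Simplified SMILES token pattern (an assumption for illustration only):
# bracket atoms, the two-letter halogens, two-digit ring closures,
# and otherwise any single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens ("words" of the chemical language)."""
    return TOKEN_PATTERN.findall(smiles)


def to_translation_pair(reaction_smiles: str) -> tuple[list[str], list[str]]:
    """Turn 'precursors>agents>products' into (source tokens, target tokens).

    Agents written between the two '>' characters are dropped in this sketch.
    """
    precursors, _agents, products = reaction_smiles.split(">")
    return tokenize(precursors), tokenize(products)


# Hypothetical example: acetic acid + ethanol -> ethyl acetate.
source, target = to_translation_pair("CC(=O)O.OCC>>CC(=O)OCC")
print(" ".join(source))
print(" ".join(target))
```

A translation model trained on many such pairs would learn to emit the target sequence given the source sequence; swapping the two sides gives the corresponding retrosynthesis setup.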
Experimental procedures for chemical synthesis are commonly reported in prose in patents or in the scientific literature. The extraction of the details necessary to reproduce and validate a synthesis in a chemical laboratory is often a tedious task requiring extensive human intervention. We present a method to convert unstructured experimental procedures written in English to structured synthetic steps (action sequences) reflecting all the operations needed to successfully conduct the corresponding chemical reactions.
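As a rough picture of what a structured action sequence could look like, the sketch below represents each synthetic step as a named action with parameters. The `Action` class, the action names, the parameter keys, and the example procedure are all hypothetical, and the sequence is written by hand; the sketch shows only a possible target format, not the extraction method itself.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One synthetic step: an operation name plus its parameters."""
    name: str
    params: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        """Format the action as NAME(key=value, ...), or NAME if it has no parameters."""
        if not self.params:
            return self.name
        details = ", ".join(f"{key}={value}" for key, value in self.params.items())
        return f"{self.name}({details})"


# Hypothetical procedure text, as it might appear in a patent.
procedure = (
    "Sodium hydroxide (1.0 g) was added to water (20 mL). "
    "The mixture was stirred for 2 h at 25 °C and then filtered."
)

# Hand-written action sequence for the text above (illustrative only).
actions = [
    Action("ADD", {"material": "water", "quantity": "20 mL"}),
    Action("ADD", {"material": "sodium hydroxide", "quantity": "1.0 g"}),
    Action("STIR", {"duration": "2 h", "temperature": "25 °C"}),
    Action("FILTER"),
]

for step, action in enumerate(actions, start=1):
    print(step, action.render())
```

Note that the order of the structured steps follows the order of operations in the laboratory (water first, then sodium hydroxide), which need not match the order in which the materials are mentioned in the sentence.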