Artificial intelligence is revolutionizing many aspects of the pharmaceutical industry. Deep learning models are now routinely applied to guide drug discovery projects, leading to faster and improved findings, but many tasks still have enormous unrealized potential. One such task is reaction yield prediction. Every year, more than one fifth of all synthesis attempts result in product yields that are either zero or too low. Chemical and human resources are thus spent on activities that ultimately do not progress the programs, a triple loss once the opportunity cost of the wasted time is included. In this work we pre-train a BERT model on more than 16 million reactions from 4 different data sources and fine-tune it to obtain an uncertainty-calibrated global yield prediction model. The model, called BERT Enriched Embedding (BEE), improves upon the state of the art not only through the larger pre-training dataset but also by introducing a new embedding layer that solves a few limitations of SMILES and enables integration of additional information, such as equivalents and molecule role, into the reaction encoding. The model is benchmarked on an open-source dataset against a state-of-the-art synthesis-focused BERT, showing a near 20-point improvement in R² score. The model is then fine-tuned and tested on an internal company data benchmark, and a prospective study shows that applying the model can reduce the total number of negative reactions (yield under 5%) run at Janssen by at least 34%. Lastly, we corroborate these results through experimental validation, directly deploying the model in an ongoing drug discovery project and showing that it can also be used successfully as a reagent recommender thanks to its fast inference speed and reliable confidence estimation, a critical feature for industry application.
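The abstract does not say how the new embedding layer combines SMILES tokens with equivalents and molecule role. As a rough illustration only, the sketch below assumes the additive scheme that standard BERT uses for token, segment, and position embeddings: each token vector is summed with a role vector and a vector scaled by the number of equivalents. The vocabularies, the dimension, and every function name here are hypothetical and are not taken from the BEE implementation.

```python
import random

DIM = 4  # toy embedding width; real models use hundreds of dimensions


def make_table(keys, dim=DIM, seed=0):
    """Build a toy lookup table mapping each key to a fixed random vector."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-1, 1) for _ in range(dim)] for k in keys}


# Hypothetical vocabularies, chosen only for illustration.
TOKEN_TABLE = make_table(["C", "O", "N", "Br", "."], seed=1)
ROLE_TABLE = make_table(["reactant", "reagent", "product"], seed=2)
# Equivalents are continuous, so this sketch scales one shared vector by the amount.
EQUIV_DIRECTION = make_table(["equiv"], seed=3)["equiv"]


def enriched_embedding(token, role, equivalents):
    """Sum token, role, and equivalents vectors into one input embedding."""
    tok = TOKEN_TABLE[token]
    rol = ROLE_TABLE[role]
    return [t + r + equivalents * e for t, r, e in zip(tok, rol, EQUIV_DIRECTION)]


def encode_molecule(tokens, role, equivalents):
    """Embed every token of one molecule with the molecule-level role and amount."""
    return [enriched_embedding(tok, role, equivalents) for tok in tokens]


# A toy "reaction": a three-token reactant at 1.0 equivalents
# and a one-token reagent at 2.5 equivalents.
sequence = (
    encode_molecule(["C", "C", "O"], "reactant", 1.0)
    + encode_molecule(["Br"], "reagent", 2.5)
)
```

In this sketch the same SMILES token receives a different input vector depending on the role and amount of the molecule it belongs to, which is the kind of information a plain SMILES string cannot carry.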
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921076 | PMC |
http://dx.doi.org/10.1186/s13321-023-00685-0 | DOI Listing |
Environ Sci Pollut Res Int
January 2025
College of Materials Science and Engineering, Nanjing Forestry University, Nanjing, 210037, China.
Since their discovery, carbon quantum dots (CDs) have been widely applied in cell imaging, drug delivery, biosensing, and photocatalysis due to their excellent water solubility, chemical stability, fluorescence stability, biocompatibility, low toxicity, and low preparation cost. However, the low fluorescence yield and poor surface structure limit the application of CDs. Heteroatom doping is considered an ideal method to improve CDs' optical and electrical properties.
Acta Neuropathol Commun
January 2025
Department of Biological Sciences, Purdue University, 915 Mitch Daniels Blvd, West Lafayette, IN, USA.
Dementia refers to an umbrella phenotype of many different underlying pathologies, with Alzheimer's disease (AD) being the most common type. Neuropathological examination remains the gold standard for accurate AD diagnosis; however, most of what we know about AD genetics is based on Genome-Wide Association Studies (GWAS) of clinically defined AD. Such studies have identified multiple AD susceptibility variants but leave a significant portion of the heritability unexplained, highlighting the phenotypic and genetic heterogeneity of the clinically defined entity.
J Immunother Cancer
January 2025
National Translational Science Center for Molecular Medicine & Department of Cell Biology, Fourth Military Medical University, Xi'an, Shaanxi, China
Background: Clear cell renal cell carcinoma (ccRCC) is the most common histologic type of RCC. However, the spatial and functional heterogeneity of immunosuppressive cells and the mechanisms by which their interactions promote immunosuppression in ccRCC have not been thoroughly investigated.
Methods: To further investigate the cellular and regional heterogeneity of ccRCC, we analyzed single-cell and spatial transcriptome RNA sequencing data from four patients, which were obtained from samples from multiple regions, including the tumor core, tumor-normal interface, and distal normal tissue.
Adv Biol Regul
December 2024
Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpönkatu 34, 33014, Finland; Institute of Biotechnology, HiLIFE, University of Helsinki, P.O. Box 56, 00014, Finland; Department of Microbiology, Fimlab Laboratories, P.O. Box 66, 33013, Tampere, Finland.
Janus kinases (JAK1-3, TYK2) are critical mediators of cytokine signaling, and their role in hematological, inflammatory, and autoimmune diseases has sparked widespread interest in their therapeutic targeting. JAKs have a unique tandem kinase structure consisting of an active tyrosine kinase domain adjacent to a pseudokinase domain that is a hotspot for pathogenic mutations. The development of JAK inhibitors has focused on the active kinase domain, and the developed drugs have demonstrated good clinical efficacy but, due to off-target inhibition, also cause side effects and carry a black box warning that limits their use.
Eur J Pharmacol
January 2025
School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, Sichuan province, P.R. China.
FOXM1 is the "Achilles' heel" of cancers and hence a potential therapeutic target for anticancer drug discovery. In this work, we selected high-affinity peptides against the DNA-binding domain of human FOXM1 (FOXM1-DBD) from the disulfide-constrained, phage-displayed random cyclic heptapeptide library Ph.D.