Publications by authors named "J Bajorath"

While data curation principles and practices are a major topic in data science, they are often not explicitly considered in machine learning (ML) applications in chemistry. We have been interested in evaluating the potential effects of data curation on the performance of molecular ML models. Therefore, a sequential curation scheme was developed for compounds and activity data, and different ML classification models were generated at increasing data confidence levels and evaluated.

Article Synopsis
  • Compound optimization in medicinal chemistry involves creating series of analogues to study structure-activity relationships (SARs), with a focus on improving potency.
  • A new computational method integrates a transformer chemical language model (CLM) with a SAR matrix (SARM) to generate potent analogues with modifications at various sites.
  • This methodology demonstrated its effectiveness by accurately predicting known potent compounds and producing diverse series through structural and substituent adjustments.

The Shapley value formalism from cooperative game theory was adapted to explain predictions of machine learning models. Here, we present a protocol to calculate and compare exact Shapley values for support vector machine models with commonly used kernels and binary input features. We describe steps for installing software, preparing data, and calculating Shapley values with customizable Python scripts.
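
To make the notion of an exact Shapley value concrete, the following minimal sketch enumerates all feature coalitions by brute force. It is not the protocol's method or its scripts, which are not reproduced here. The names `exact_shapley` and `toy_model` are hypothetical, the scoring function is a stand-in for a trained model, and treating absent binary features as 0 is an assumption made for illustration. Brute-force enumeration scales exponentially with the number of features, so it is only practical for very small inputs.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline=None):
    """Exact Shapley values by enumerating every coalition of features.

    predict  : callable taking a list of feature values, returning a number
    x        : feature vector of the instance to explain
    baseline : values used for features absent from a coalition
               (assumption: all zeros, i.e. "bit not set")
    """
    n = len(x)
    if baseline is None:
        baseline = [0] * n

    def value(coalition):
        # Features in the coalition keep their value; the rest use the baseline.
        masked = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(masked)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    # Hypothetical scoring function with one interaction term (not an SVM).
    return 2.0 * features[0] + 1.0 * features[1] + 3.0 * features[0] * features[2]


shap_values = exact_shapley(toy_model, [1, 1, 1])
print(shap_values)
```

In this toy case the interaction term is split equally between the two features involved, and the values sum to the difference between the prediction for the instance and the prediction for the all-zero baseline.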


Over the past ~25 years, chemoinformatics has evolved as a scientific discipline, with a strong foundation in pharmaceutical research and scientific roots that can be traced back to the late 1950s. It covers a wide methodological spectrum and is perhaps best positioned in the greater context of chemical information science. Herein, the chemoinformatics discipline is delineated, characteristic (and partly problematic) features are discussed, and a global view of the field is provided, emphasizing key developments.


In drug discovery, human protein kinases (PKs) represent one of the major target classes due to their central role in cellular signaling, implication in various diseases as a consequence of deregulated signaling, and notable druggability. Individual PKs and their disease biology have been explored to different degrees, giving rise to heterogeneous functional knowledge and disease associations across the human kinome. The U.
