Publications by David R Koes | LitMetric

Publications by authors named "David R Koes"

CACHE Challenge #1: Docking with GNINA Is All You Need.

Ian Dunn Somayeh Pirhadi Yao Wang Smmrithi Ravindran Carter Concepcion David Ryan Koes

J Chem Inf Model

December 2024

We describe our winning submission to the first Critical Assessment of Computational Hit-Finding Experiments (CACHE) challenge. In this challenge, 23 participants employed a diverse array of structure-based methods to identify hits to a target with no known ligands. We utilized two methods, pharmacophore search and molecular docking, to identify our initial hit list and compounds for the hit expansion phase.

Exploring Discrete Flow Matching for 3D De Novo Molecule Generation.

Ian Dunn David R Koes

ArXiv

November 2024

Deep generative models that produce novel molecular structures have the potential to facilitate chemical discovery. Flow matching is a recently proposed generative modeling framework that has achieved impressive performance on a variety of tasks including those on biomolecular structures. The seminal flow matching framework was developed only for continuous data.

PharmRL: Pharmacophore elucidation with Deep Geometric Reinforcement Learning.

Rishal Aggarwal David R Koes

Res Sq

September 2024

Molecular interactions between proteins and their ligands are important for drug design. A pharmacophore consists of favorable molecular interactions in a protein binding site and can be utilized for virtual screening. Pharmacophores are easiest to identify from co-crystal structures of a bound protein-ligand complex.

CENsible: Interpretable Insights into Small-Molecule Binding with Context Explanation Networks.

Roshni Bhatt David Ryan Koes Jacob D Durrant

J Chem Inf Model

June 2024

We present a novel and interpretable approach for assessing small-molecule binding using context explanation networks. Given the specific structure of a protein/ligand complex, our CENsible scoring function uses a deep convolutional neural network to predict the contributions of precalculated terms to the overall binding affinity. We show that CENsible can effectively distinguish active vs inactive compounds for many systems.

Accelerating Inference in Molecular Diffusion Models with Latent Representations of Protein Structure.

Ian Dunn David Ryan Koes

ArXiv

May 2024

Diffusion generative models have emerged as a powerful framework for addressing problems in structural biology and structure-based drug design. These models operate directly on 3D molecular structures. Due to the unfavorable scaling of graph neural networks (GNNs) with graph size as well as the relatively slow inference speeds inherent to diffusion models, many existing molecular diffusion models rely on coarse-grained representations of protein structure to make training and inference feasible.

Sliding Window INteraction Grammar (SWING): a generalized interaction language model for peptide and protein interactions.

Alisa A Omelchenko Jane C Siwek Prabal Chhibbar Sanya Arshad Iliyan Nazarali David R Koes

bioRxiv

May 2024

The explosion of sequence data has allowed the rapid growth of protein language models (pLMs). pLMs have now been employed in many frameworks including variant-effect and peptide-specificity prediction. Traditionally, for protein-protein or peptide-protein interactions (PPIs), corresponding sequences are either co-embedded followed by post-hoc integration or the sequences are concatenated prior to embedding.

Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation.

Ian Dunn David Ryan Koes

ArXiv

April 2024

Deep generative models that produce novel molecular structures have the potential to facilitate chemical discovery. Diffusion models currently achieve state-of-the-art performance for 3D molecule generation. In this work, we explore the use of flow matching, a recently proposed generative modeling framework that generalizes diffusion models, for the task of de novo molecule generation.

Structure-Infused Protein Language Models.

Daniel Peñaherrera David Ryan Koes

bioRxiv

April 2024

Embeddings from protein language models (PLMs) capture intricate patterns for protein sequences, enabling more accurate and efficient prediction of protein properties. Incorporating protein structure information as direct input into PLMs improves the predictive ability of protein embeddings on downstream tasks. In this work we demonstrate that indirectly infusing structure information into PLMs also leads to performance gains on structure-related tasks.

BigBind: Learning from Nonstructural Data for Structure-Based Virtual Screening.

Michael Brocidiacono Paul Francoeur Rishal Aggarwal Konstantin I Popov David Ryan Koes

J Chem Inf Model

April 2024

Deep learning methods that predict protein-ligand binding have recently been used for structure-based virtual screening. Many such models have been trained using protein-ligand complexes with known crystal structures and activities from the PDBbind data set. However, because PDBbind only includes 20K complexes, models typically fail to generalize to new targets, and model performance is on par with models trained with only ligand information.

Interpreting forces as deep learning gradients improves quality of predicted protein structures.

Jonathan Edward King David Ryan Koes

Biophys J

September 2024

Protein structure predictions from deep learning models like AlphaFold2, despite their remarkable accuracy, are likely insufficient for direct use in downstream tasks like molecular docking. The functionality of such models could be improved with a combination of increased accuracy and physical intuition. We propose a new method to train deep learning protein structure prediction models using molecular dynamics force fields to work toward these goals.

Open-ComBind: harnessing unlabeled data for improved binding pose prediction.

Andrew T McNutt David Ryan Koes

J Comput Aided Mol Des

December 2023

Determination of the bound pose of a ligand is a critical first step in many in silico drug discovery tasks. Molecular docking is the main tool for the prediction of non-covalent binding of a protein and ligand system. Molecular docking pipelines often only utilize the information of one ligand binding to the protein despite the commonly held hypothesis that different ligands share binding interactions when bound to the same receptor.

Systematic Comparison of Experimental Crystallographic Geometries and Gas-Phase Computed Conformers for Torsion Preferences.

Dakota L Folmsbee David R Koes Geoffrey R Hutchison

J Chem Inf Model

December 2023

We performed exhaustive torsion sampling on more than 3 million compounds using the GFN2-xTB method and performed a comparison of experimental crystallographic and gas-phase conformers. Many conformer sampling methods derive torsional angle distributions from experimental crystallographic data, limiting the torsion preferences to molecules that must be stable, synthetically accessible, and able to be crystallized. In this work, we evaluate the differences in torsional preferences of experimental crystallographic geometries and gas-phase computed conformers from a broad selection of compounds to determine whether torsional angle distributions obtained from semiempirical methods are suitable priors for conformer sampling.

Deciphering the Role of Fatty Acid-Metabolizing CYP4F11 in Lung Cancer and Its Potential As a Drug Target.

Huiting Jia Bjoern Brixius Caleb Bocianoski Sutapa Ray David R Koes

Drug Metab Dispos

January 2024

Lung cancer is the leading cause of cancer deaths worldwide. We found that the cytochrome P450 isoform CYP4F11 is significantly overexpressed in patients with lung squamous cell carcinoma. CYP4F11 is a fatty acid ω-hydroxylase and catalyzes the production of the lipid mediator 20-hydroxyeicosatetraenoic acid (20-HETE) from arachidonic acid.

Expanding Training Data for Structure-Based Receptor-Ligand Binding Affinity Regression through Imputation of Missing Labels.

Paul G Francoeur David R Koes

ACS Omega

November 2023

The success of machine learning is, in part, due to a large volume of data available to train models. However, the amount of training data for structure-based molecular property prediction remains limited. The previously described CrossDocked2020 data set expanded the available training data for binding pose classification in a molecular docking setting but did not address expanding the amount of receptor-ligand binding affinity data.

CENsible: Interpretable Insights into Small-Molecule Binding with Context Explanation Networks.

Roshni Bhatt David Ryan Koes Jacob D Durrant

bioRxiv

October 2023

We present a novel and interpretable approach for predicting small-molecule binding affinities using context explanation networks (CENs). Given the specific structure of a protein/ligand complex, our CENsible scoring function uses a deep convolutional neural network to predict the contributions of pre-calculated terms to the overall binding affinity. We show that CENsible can effectively distinguish active vs.

Conformer Generation for Structure-Based Drug Design: How Many and How Good?

Andrew T McNutt Fatimah Bisiriyu Sophia Song Ananya Vyas Geoffrey R Hutchison David Ryan Koes

J Chem Inf Model

November 2023

Conformer generation, the assignment of realistic 3D coordinates to a small molecule, is fundamental to structure-based drug design. Conformational ensembles are required for rigid-body matching algorithms, such as shape-based or pharmacophore approaches, and even methods that treat the ligand flexibly, such as docking, are dependent on the quality of the provided conformations due to not sampling all degrees of freedom (e.g.

PLANTAIN: Diffusion-inspired Pose Score Minimization for Fast and Accurate Molecular Docking.

Michael Brocidiacono Konstantin I Popov David Ryan Koes Alexander Tropsha

ArXiv

July 2023

Molecular docking aims to predict the 3D pose of a small molecule in a protein binding site. Traditional docking methods predict ligand poses by minimizing a physics-inspired scoring function. Recently, a diffusion model has been proposed that iteratively refines a ligand pose.

Improving Predictions with a Multitask Convolutional Siamese Network.

Andrew T McNutt David Ryan Koes

J Chem Inf Model

April 2022

The lead optimization phase of drug discovery refines an initial hit molecule for desired properties, especially potency. Synthesis and experimental testing of the small perturbations during this refinement can be quite costly and time-consuming. Relative binding free energy (RBFE, also referred to as ΔΔG) methods allow the estimation of binding free energy changes after small changes to a ligand scaffold.

Generating 3D molecules conditional on receptor binding sites with deep generative models.

Matthew Ragoza Tomohide Masuda David Ryan Koes

Chem Sci

March 2022

The goal of structure-based drug discovery is to find small molecules that bind to a given target protein. Deep learning has been used to generate drug-like molecules with certain cheminformatic properties, but has not yet been applied to generating 3D molecules predicted to bind to proteins by sampling the conditional distribution of protein-ligand binding interactions. In this work, we describe for the first time a deep learning system for generating 3D molecular structures conditioned on a receptor binding site.

Virtual Screening with Gnina 1.0.

Jocelyn Sunseri David Ryan Koes

Molecules

December 2021

Virtual screening (predicting which compounds within a specified compound library bind to a target molecule, typically a protein) is a fundamental task in the field of drug discovery. Doing virtual screening well provides tangible practical benefits, including reduced drug development costs, faster time to therapeutic viability, and fewer unforeseen side effects. As with most applied computational tasks, the algorithms currently used to perform virtual screening feature inherent tradeoffs between speed and accuracy.

Correction to "SolTranNet-A Machine Learning Tool for Fast Aqueous Solubility Prediction".

Paul G Francoeur David R Koes

J Chem Inf Model

August 2021

SidechainNet: An all-atom protein structure dataset for machine learning.

Jonathan Edward King David Ryan Koes

Proteins

November 2021

Despite recent advancements in deep learning methods for protein structure prediction and representation, little focus has been directed at the simultaneous inclusion and prediction of protein backbone and sidechain structure information. We present SidechainNet, a new dataset that directly extends the ProteinNet dataset. SidechainNet includes angle and atomic coordinate information capable of describing all heavy atoms of each protein structure and can be extended by users to include new protein structures as they are released.

DeepFrag: a deep convolutional neural network for fragment-based lead optimization.

Harrison Green David R Koes Jacob D Durrant

Chem Sci

May 2021

Machine learning has been increasingly applied to the field of computer-aided drug discovery in recent years, leading to notable advances in binding-affinity prediction, virtual screening, and QSAR. Surprisingly, it is less often applied to lead optimization, the process of identifying chemical fragments that might be added to a known ligand to improve its binding affinity. We here describe a deep convolutional neural network that predicts appropriate fragments given the structure of a receptor/ligand complex.

GNINA 1.0: molecular docking with deep learning.

Andrew T McNutt Paul Francoeur Rishal Aggarwal Tomohide Masuda Rocco Meli David Ryan Koes

J Cheminform

June 2021

Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.

SolTranNet-A Machine Learning Tool for Fast Aqueous Solubility Prediction.

Paul G Francoeur David R Koes

J Chem Inf Model

June 2021

While accurate prediction of aqueous solubility remains a challenge in drug discovery, machine learning (ML) approaches have become increasingly popular for this task. For instance, in the Second Challenge to Predict Aqueous Solubility (SC2), all groups utilized machine learning methods in their submissions. We present SolTranNet, a molecule attention transformer to predict aqueous solubility from a molecule's SMILES representation.
