Publications by authors named "Darren V S Green"

The design of compounds during hit-to-lead often seeks to explore a vector from a core scaffold to form additional interactions with the target protein. A rational approach to this is to probe the region of a protein accessed by a vector with a systematic placement of pharmacophore features in 3D, particularly when bound structures are not available. Herein, we present bbSelect, an open-source tool built to map the placements of pharmacophore features in 3D Euclidean space from a library of R-groups, employing partitioning to drive a diverse and systematic selection to a user-defined size.
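The partition-driven selection described above can be illustrated with a minimal sketch. This is not bbSelect's actual algorithm or API. It assumes each R-group has already been reduced to a single 3D feature coordinate, and the names `partition_select`, `cell_size` and `n_select` are hypothetical. Points are binned into cubic cells and one representative is taken per occupied cell, up to the requested size.

```python
from collections import defaultdict
from math import floor


def partition_select(points, cell_size, n_select):
    """Pick at most n_select names, one per occupied cubic grid cell.

    points: dict mapping a (hypothetical) R-group name to an (x, y, z) tuple.
    cell_size: edge length of each cubic partition cell (assumed parameter).
    """
    cells = defaultdict(list)
    for name, (x, y, z) in points.items():
        # Integer cell index along each axis; floor handles negative coordinates.
        key = (floor(x / cell_size), floor(y / cell_size), floor(z / cell_size))
        cells[key].append(name)

    # Most populated cells first; ties broken by cell index for determinism.
    ranked = sorted(cells.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    # One representative per cell (alphabetically first, an arbitrary choice).
    return [sorted(names)[0] for _, names in ranked[:n_select]]


r_groups = {
    "a": (0.1, 0.1, 0.1),
    "b": (0.2, 0.3, 0.1),
    "c": (2.5, 0.0, 0.0),
    "d": (-1.5, 0.0, 0.0),
}
selected = partition_select(r_groups, cell_size=1.0, n_select=2)
```

Taking one item per cell spreads the selection across the occupied space rather than concentrating it where R-groups are densest, which is the general idea behind partition-based diversity selection.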

One of the main problems that the drug discovery research field confronts is to identify small molecules, modulators of protein function, which are likely to be therapeutically useful. Common practices rely on the screening of vast libraries of small molecules (often 1-2 million molecules) in order to identify a molecule, known as a lead molecule, which specifically inhibits or activates the protein function. To search for the lead molecule, we investigate the molecular structure, which generally consists of an extremely large number of fragments.

Malaria is a disease affecting hundreds of millions of people across the world, mainly in developing countries and especially in sub-Saharan Africa. It is the cause of hundreds of thousands of deaths each year and there is an ever-present need to identify and develop effective new therapies to tackle the disease and overcome increasing drug resistance. Here, we extend a previous study in which a number of partners collaborated to develop a consensus in silico model that can be used to identify novel molecules that may have antimalarial properties.

Machine learning approaches promise to accelerate and improve success rates in medicinal chemistry programs by more effectively leveraging available data to guide molecular design. A key step of an automated computational design algorithm is molecule generation, where the machine is required to design high-quality, drug-like molecules within the appropriate chemical space. Many algorithms have been proposed for molecular generation; however, a challenge is how to assess the validity of the resulting molecules.

Deep learning approaches have become popular in recent years in the field of molecular design. While a variety of different methods are available, it is still a challenge to assess and compare their performance. A particularly promising approach for automated drug design is to use recurrent neural networks (RNNs) as SMILES generators and train them with the learning procedure called "transfer learning".

The original version of this article unfortunately contained some mistakes in the references.

This paper introduces BRADSHAW (Biological Response Analysis and Design System using an Heterogenous, Automated Workflow), a system for automated molecular design which integrates methods for chemical structure generation, experimental design, active learning and cheminformatics tools. The simple user interface is designed to facilitate access to large scale automated design whilst minimising software development required to introduce new algorithms, a critical requirement in what is a very fast moving field. The system embodies a philosophy of automation, best practice, experimental design and the use of both traditional cheminformatics and modern machine learning algorithms.

High-throughput screening (HTS) hits include compounds with undesirable properties. Many filters have been described to identify such hits. Notably, pan-assay interference compounds (PAINS) has been adopted by the community as the standard term for such filters, and very useful guidelines have been adopted by the American Chemical Society (ACS), which subsequently triggered a healthy scientific debate about the pitfalls of draconian use of filters.

In this work, two freely available web-based interactive computational tools that facilitate the analysis and interpretation of protein-ligand interaction data are described. The first, WONKA, assists in uncovering interesting and unusual features (for example, residue motions) within ensembles of protein-ligand structures and enables the facile sharing of observations between scientists. The second, OOMMPPAA, combines protein-ligand activity data with protein-ligand structural data using three-dimensional matched molecular pairs.

The development of new antimalarial therapies is essential, and lowering the barrier to entry for the screening and discovery of new lead compound classes can spur drug development at organizations that may not have large compound screening libraries or resources to conduct high-throughput screens. Machine learning models have long been established to be more robust and to have a larger domain of applicability when built with larger training sets. Screens over multiple data sets to find compounds with potential malaria blood stage inhibitory activity have been used to generate multiple Bayesian models.

The acronym "CADD" is often used interchangeably to refer to "Computer Aided Drug Discovery" and "Computer Aided Drug Design". While the former definition implies the use of a computer to impact one or more aspects of discovering a drug, in this paper we contend that computational chemists are most effective when they enable teams to apply true design principles as they strive to create medicines to treat human disease. We argue that teams must bring to bear multiple sub-disciplines of computational chemistry in an integrated manner in order to utilize these principles to address the multi-objective nature of the drug discovery problem.

In an attempt to seek increased understanding of compound attributes that influence successful drug pipeline progression, GlaxoSmithKline's portfolio of oral candidates was compared with reference sets of marketed oral drugs. The approach differs from other attrition studies by explicitly focusing on choosing 'the right compound' by applying relevant, experimentally derived properties. The analysis led to four proposed compound quality categories, created by combining specific criteria for three measures: dose, solubility and the property forecast index, a composite measure of lipophilicity using chromatographically determined LogD and aromaticity.
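As a rough illustration of the composite measure mentioned above, the sketch below assumes the commonly cited form of the property forecast index, in which chromatographic LogD (at pH 7.4) is added to the number of aromatic rings. The function name and the example values are hypothetical. The abstract does not give the dose, solubility or PFI criteria that define the four quality categories, so none are encoded here.

```python
def property_forecast_index(chrom_logd: float, aromatic_rings: int) -> float:
    """Assumed form: PFI = chromatographic LogD + number of aromatic rings."""
    if aromatic_rings < 0:
        raise ValueError("aromatic_rings must be non-negative")
    return chrom_logd + aromatic_rings


# Hypothetical compound: chromatographic LogD of 2.5 and three aromatic rings.
pfi = property_forecast_index(2.5, 3)
```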

There is an ever increasing resource in terms of both structural information and activity data for many protein targets. In this paper we describe OOMMPPAA, a novel computational tool designed to inform compound design by combining such data. OOMMPPAA uses 3D matched molecular pairs to generate 3D ligand conformations.

We describe the QSAR Workbench, a system for the building and analysis of QSAR models. The system is built around the Pipeline Pilot workflow tool and provides access to a variety of model building algorithms for both continuous and categorical data. Traditionally, models are built on a one-by-one basis, and fully exploring the model space of algorithms and descriptor subsets is a time-consuming task.

Here, we review the performance of chromatographic hydrophobicity measurements in a data set of 100,000 GlaxoSmithKline compounds, demonstrating the advantages of the method over octanol-water partitioning and highlighting new insights for drug discovery. The value of chromatographic measurements, versus other hydrophobicity estimates, was supported by improved relationships with solubility, permeation, cytochrome P450s, intrinsic clearance, hERG binding and promiscuity. We also observed marked differentiation of the relative influence of intrinsic and effective hydrophobicity.

Drug toxicity is a major cause of late-stage product attrition. During lead identification and optimization phases little information is typically available about which molecules might have safety concerns. A system was built linking chemistry, preclinical and human safety information, enabling scientists to lever safety knowledge across multiple disciplines.

High-throughput screening (HTS) has been postulated in several quarters to be a contributory factor to the decline in productivity in the pharmaceutical industry. Moreover, it has been blamed for stifling the creativity that drug discovery demands. In this article, we aim to dispel these myths and present the case for the use of HTS as part of a proven scientific tool kit, the wider use of which is essential for the discovery of new chemotypes.

Traditional lead optimization projects involve long synthesis and testing cycles, favoring extensive structure-activity relationship (SAR) analysis and molecular design steps, in an attempt to limit the number of cycles that a project must run to optimize a development candidate. Microfluidic-based chemistry and biology platforms, with cycle times of minutes rather than weeks, lend themselves to unattended autonomous operation. The bottleneck in the lead optimization process is therefore shifted from synthesis or test to SAR analysis and design.

Malaria is a devastating infection caused by protozoa of the genus Plasmodium. Drug resistance is widespread, no new chemical class of antimalarials has been introduced into clinical practice since 1996 and there is a recent rise of parasite strains with reduced sensitivity to the newest drugs. We screened nearly 2 million compounds in GlaxoSmithKline's chemical library for inhibitors of P.

Virtual screening of virtual libraries (VSVL) is a rapidly changing area of research. Great efforts are being made to produce better algorithms, selection methods and infrastructure. Yet, the number of successful examples in the literature is not impressive, although the quality of work certainly is high.

This paper addresses a major issue in library design, namely how to efficiently optimize the library size (number of products) and configuration (number of reagents at each position) simultaneously with other properties such as diversity, cost, and drug-like physicochemical property profiles. These objectives are often in competition, for example, minimizing the number of reactants while simultaneously maximizing diversity, and thus present difficulties for traditional optimization methods such as genetic algorithms and simulated annealing. Here, a multiobjective genetic algorithm (MOGA) is used to vary library size and configuration simultaneously with other library properties.
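The competition between objectives described above is usually handled through Pareto dominance, which a multiobjective genetic algorithm uses to rank candidates instead of collapsing the objectives into a single score. The sketch below shows only that dominance test and a non-dominated filter, not the paper's MOGA itself. The candidate libraries and their values are invented for illustration, and every objective is expressed so that lower is better (diversity is negated).

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical libraries as (number of reactants, negated diversity).
candidates = [(10, -0.8), (12, -0.9), (15, -0.85), (10, -0.7)]
front = pareto_front(candidates)
```

Here the 15-reactant library is dominated by the 12-reactant one (fewer reactants and higher diversity), while the two survivors represent different trade-offs that a single weighted score would hide.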

Deriving quantitative structure-activity relationship (QSAR) models that are accurate, reliable, and easily interpretable is a difficult task. In this study, two new methods have been developed that aim to find useful QSAR models that represent an appropriate balance between model accuracy and complexity. Both methods are based on genetic programming (GP).
