Similarity searching using a single bioactive reference structure is a well-established technique for accessing chemical structure databases. This paper describes two extensions of the basic approach. First, we discuss the use of group fusion to combine the results of similarity searches when multiple reference structures are available.
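As a rough illustration of group fusion, the sketch below scores each database structure against every reference structure and keeps the maximum similarity. It assumes fingerprints are stored as sets of on-bit indices and that the MAX rule is used for fusion; the abstract does not state which fusion rule the paper applies, and the names `tanimoto` and `group_fusion_max` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is represented as the set of its on-bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Binary Tanimoto coefficient: shared bits / bits set in either fingerprint."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def group_fusion_max(
    references: List[Fingerprint], database: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Rank database structures by their best similarity to any reference.

    The MAX fusion rule is an assumption made for this sketch.
    """
    fused = {
        name: max(tanimoto(ref, fp) for ref in references)
        for name, fp in database.items()
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


references = [frozenset({1, 2, 3}), frozenset({7, 8})]
database = {
    "a": frozenset({1, 2, 3, 4}),
    "b": frozenset({7, 8}),
    "c": frozenset({20}),
}
ranking = group_fusion_max(references, database)
```

Here `ranking` lists the structures in decreasing order of fused score, so a structure that closely resembles any one of the references is ranked highly.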
We test the hypothesis that fusing the outputs of similarity searches based on a single bioactive reference structure and on its nearest neighbors (of unknown activity) is more effective (in terms of numbers of high-ranked active structures) than a similarity search involving just the reference structure. This turbo similarity searching approach provides a simple way to enhance the effectiveness of simulated virtual screening searches of the MDL Drug Data Report database.
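The turbo similarity searching idea can be sketched in a few lines: run an initial search with the single reference, treat its top-ranked neighbors as additional references, and fuse the resulting searches. This is a minimal sketch under stated assumptions, not the paper's implementation: fingerprints are sets of on-bit indices, the number of neighbors `k` is arbitrary, fusion uses the MAX rule, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a fingerprint is represented as the set of its on-bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Binary Tanimoto coefficient between two fingerprints."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def turbo_search(
    reference: Fingerprint, database: Dict[str, Fingerprint], k: int = 1
) -> List[Tuple[str, float]]:
    """Fuse searches from the reference and its k nearest neighbors."""
    # Step 1: conventional similarity search with the single reference.
    initial = sorted(
        database, key=lambda name: tanimoto(reference, database[name]), reverse=True
    )
    # Step 2: the k top-ranked structures (activity unknown) become extra queries.
    queries = [reference] + [database[name] for name in initial[:k]]
    # Step 3: fuse the searches; the MAX rule is an assumption of this sketch.
    fused = {
        name: max(tanimoto(query, fp) for query in queries)
        for name, fp in database.items()
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


reference = frozenset({1, 2, 3, 4})
database = {
    "n1": frozenset({1, 2, 3, 5}),
    "n2": frozenset({1, 2, 9}),
    "far": frozenset({5, 6, 7}),
    "other": frozenset({30, 31}),
}
ranking = turbo_search(reference, database, k=1)
```

In this toy data set, `far` shares no bits with the reference but does share one with the nearest neighbor `n1`, so the fused search gives it a non-zero score that the single-reference search would not.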
The NIBR (Novartis Institutes for BioMedical Research) compound collection enrichment and enhancement project integrates corporate internal combinatorial compound synthesis and external compound acquisition activities in order to build up a comprehensive screening collection for a modern drug discovery organization. The main purpose of the screening collection is to supply the Novartis drug discovery pipeline with hit-to-lead compounds for the current and future portfolio of drug discovery programs, and to provide tool compounds for the chemogenomics investigation of novel biological pathways and circuits. As such, it integrates designed focused and diversity-based compound sets from the synthetic and natural paradigms able to cope with druggable and currently deemed undruggable targets and molecular interaction modes.
This paper reports a detailed comparison of a range of different types of 2D fingerprints when used for similarity-based virtual screening with multiple reference structures. Experiments with the MDL Drug Data Report database demonstrate the effectiveness of fingerprints that encode circular substructure descriptors generated using the Morgan algorithm. These fingerprints are notably more effective than fingerprints based on a fragment dictionary, on hashing and on topological pharmacophores.
For the clustering of chemical structures that are described by the Similog, ISIS count, and ISIS binary fingerprints, we propose a sequential superparamagnetic clustering approach. To appropriately handle nonbinary feature keys, we introduce an extension of the binary Tanimoto similarity measure. In our applications, data sets composed of structures from seven chemically distinct compound classes are evaluated and correctly clustered.
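The abstract does not give the form of the extended Tanimoto measure. For orientation only, the sketch below shows one commonly used generalization to count vectors, the dot product divided by the sum of the squared norms minus the dot product, which reduces to the binary Tanimoto coefficient for 0/1 vectors. Whether this matches the extension introduced in the paper is an assumption, and `tanimoto_counts` is a hypothetical name.

```python
from typing import Sequence


def tanimoto_counts(x: Sequence[float], y: Sequence[float]) -> float:
    """Generalized Tanimoto for count vectors: x.y / (x.x + y.y - x.y).

    Assumption: this widely used form stands in for the paper's own extension,
    which the abstract does not specify.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    denominator = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # Two all-zero vectors have no features to compare; return 0.0 by convention.
    return dot / denominator if denominator else 0.0


binary_score = tanimoto_counts([1, 1, 0, 1], [1, 0, 1, 1])
count_score = tanimoto_counts([2, 0, 1], [1, 1, 1])
```

For the binary pair, two bits are shared out of four set in either vector, so the result equals the ordinary binary Tanimoto value.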
Fingerprint-based similarity searching is widely used for virtual screening when only a single bioactive reference structure is available. This paper reviews three distinct ways of carrying out such searches when multiple bioactive reference structures are available: merging the individual fingerprints into a single combined fingerprint; applying data fusion to the similarity rankings resulting from individual similarity searches; and approximations to substructural analysis. Extended searches on the MDL Drug Data Report database suggest that fusing similarity scores is the most effective general approach, with the best individual results coming from the binary kernel discrimination technique.
The technology underpinning high-throughput docking (HTD) has developed over the past few years to the point where it has become a vital tool in modern drug discovery. Although the performance of various docking algorithms is adequate, the ability to accurately and consistently rank compounds using a scoring function remains problematic. We show that by employing a simple machine learning method (naïve Bayes) it is possible to significantly overcome this deficiency.
The noise level of a high-throughput screening (HTS) experiment depends on various factors such as the quality and robustness of the assay itself and the quality of the robotic platform. Screening of compound mixtures is noisier than screening single compounds per well. A classification model based on naïve Bayes (NB) may be used to enrich such data.
Successful treatment of beta-thalassemia requires two key elements: blood transfusion and iron chelation. Regular blood transfusions considerably extend the lifespan of patients; however, without the removal of the consequential accumulation of body iron, few patients live beyond their second decade. In 1963, the introduction of desferrioxamine (DFO), a hexadentate chelator, marked a breakthrough in the treatment of beta-thalassemia.
In this study we evaluate how far the scope of similarity searching can be extended to identify not only ligands binding to the same target as the reference ligand(s) but also ligands of other homologous targets without initially known ligands. This "homology-based similarity searching" requires molecular representations reflecting the ability of a molecule to interact with target proteins. The Similog keys, which are introduced here as a new molecular representation, were designed to fulfill such requirements.