Publications by authors named "Jurgen Bajorath"

Fragment formal concept analysis (FragFCA) for compound classification: Signature fragment combinations for compound classes with closely related biological activity were identified using FragFCA. These combinations are used to accurately classify active test compounds on the basis of fragment mapping. FragFCA can extract class-specific fragment combinations from compounds active against different target families that have signature character and practical utility in compound classification and database searching.

Support vector machine (SVM) database search strategies are presented that aim at the identification of small molecule ligands for targets for which no ligand information is currently available. In pharmaceutical research and chemical biology, this situation is faced, for example, when studying orphan targets or newly identified members of protein families. To investigate methods for de novo ligand identification in the absence of known three-dimensional target structures or active molecules, we have focused on combining sequence and ligand information for closely and distantly related proteins.

The Pipeline Pilot extended connectivity fingerprints (ECFPs) are currently among the most popular similarity search tools in drug discovery settings. ECFPs do not have a fixed bit string format but generate variable numbers of structural features for individual test molecules. This variable string design makes ECFP representations amenable to compound-class-directed modification.

The identification of small chemical compounds that are selective for a target protein over one or more closely related members of the same family is of high relevance for applications in chemical biology. Conventional 2D similarity searching using known selective molecules as templates has recently been found to preferentially detect selective over non-selective and inactive database compounds. To improve the initially observed search performance, we have attempted to use 2D fingerprints as descriptors for support vector machine (SVM)-based selectivity searching.

For the computational exploration of structure-selectivity relationships, a compound selectivity system consisting of 243 antagonists of ionotropic glutamate ligand-gated ion channels was designed. Selected antagonists were organized in nine different selectivity sets. In systematic selectivity search calculations utilizing these data sets, structural fingerprints produced a significant enrichment of selective compounds over non-selective molecules and database decoys.

Blockade of the human ether-a-go-go related gene (hERG) potassium channel is regarded as a major cause of drug toxicity and is associated with severe cardiac side effects. A variety of in silico models have been reported to aid in the identification of compounds that block the hERG channel. Herein, we present a classification approach for the detection of diverse hERG blockers that combines cluster analysis of training data, feature selection, and support vector machine learning.

A computational molecular network analysis of various high-throughput screening (HTS) data sets including inhibition assays and cell-based screens organizes screening hits according to different local structure-activity relationships (SARs). The resulting network representations make it possible to focus on different local SAR environments in screening data. We have designed a simple scoring function accounting for similarity and potency relationships among hits that identifies SAR pathways leading from active compounds in different SAR contexts to key compounds forming activity cliffs.

Background: Chemical biology and chemogenomics are rapidly evolving disciplines at interfaces between chemistry and the life sciences and are highly interdisciplinary in nature. Chemogenomics has a strong conceptual link to modern drug discovery research, whereas chemical biology focuses more on the use of small molecules as probes for exploring biological functions, rather than as drug candidates. However, the boundaries between these areas are fluid, as they should be, given their strong interdisciplinary orientation.

We report the design and evaluation of structural key-type fingerprints that consist of only 10-30 substructures isolated from randomly generated fragment populations of different classes of active compounds. To identify minimal sets of fragments that carry substantial compound class-specific information, fragment frequency calculations are applied to guide fingerprint generation. These compound class-directed and extremely small structural fingerprints push the design of so-called mini-fingerprints to the limit and are the shortest bit string fingerprints reported to date.
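
As a rough illustration of frequency-guided fingerprint design, the sketch below counts how many active compounds contain each fragment and keeps the most frequent ones as bit positions. The abstract does not state the actual selection criteria, so the ranking rule, the function names, and the toy fragment labels are all assumptions made for illustration.

```python
from collections import Counter

def select_fragments(active_fragment_sets, n_bits):
    """Pick the n_bits fragments that occur in the most active compounds.

    Assumption: plain occurrence frequency is the selection criterion;
    ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(f for frags in active_fragment_sets for f in set(frags))
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return ranked[:n_bits]

def mini_fingerprint(fragments, selected):
    """Encode a compound as one bit per selected fragment (1 = present)."""
    return [1 if f in fragments else 0 for f in selected]

# Toy fragment sets for three hypothetical active compounds.
actives = [{"A", "B", "C"}, {"A", "B"}, {"A", "D"}]
selected = select_fragments(actives, n_bits=2)
query_bits = mini_fingerprint({"B", "D"}, selected)
```

Here `selected` holds the two fragments seen most often among the toy actives, and `query_bits` is the resulting two-bit representation of a query compound.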

We report the development and application of the Topological Fragment Index (ToFI), a measure for the complexity of the topological environment of defined molecular fragments in active compounds. On the basis of ToFI calculations, RECAP fragments are organized in dependency hierarchies that capture fragment co-occurrence and facilitate the identification of topology clusters and activity class characteristic RECAP fragments. By combining structural and topological environment information through ToFI, RECAP fragments that are a signature of compounds active against one of several closely related targets are consistently identified.

To incorporate protein-ligand interaction information into conventional two-dimensional (2D) fingerprint searching, interacting fragments of active compounds were extracted from X-ray structures of protein-ligand complexes and encoded as structural key-type fingerprints. Similarity search calculations with fingerprints derived from interacting fragments were compared to fingerprints of complete ligands and control fragments. In these calculations, fingerprints of interacting fragments produced significantly higher compound recall than other fingerprints.

A methodology is introduced to assign energy-based scores to two-dimensional (2D) structural features based on three-dimensional (3D) ligand-target interaction information and utilize interaction-annotated features in virtual screening. Database molecules containing such fragments are assigned cumulative scores that serve as a measure of similarity to active reference compounds. The Interaction Annotated Structural Features (IASF) method is applied to mine five high-throughput screening (HTS) data sets and often identifies more hits than conventional fragment-based similarity searching or ligand-protein docking.
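
The cumulative scoring idea can be sketched as a lookup-and-sum over annotated fragments. This is only a minimal sketch: the fragment strings, the score values, and the convention that more negative scores are more favorable are assumptions, since the abstract does not specify how the energy-based scores are computed or combined.

```python
# Hypothetical fragment scores; values are made up for illustration.
# Assumed convention: more negative = more favorable interaction energy.
FRAGMENT_SCORES = {"c1ccccc1": -1.5, "C(=O)N": -2.0, "CO": -0.5}

def cumulative_score(molecule_fragments, scores=FRAGMENT_SCORES):
    """Sum the scores of all annotated fragments present in a molecule.

    Each fragment counts once; fragments without annotation contribute 0.
    """
    return sum(scores.get(f, 0.0) for f in set(molecule_fragments))

def rank_database(database, scores=FRAGMENT_SCORES):
    """Order molecule names from most to least favorable cumulative score."""
    return sorted(database, key=lambda name: cumulative_score(database[name], scores))

# Toy database mapping hypothetical molecule names to their fragments.
database = {
    "mol_a": ["c1ccccc1", "C(=O)N"],
    "mol_b": ["CO"],
    "mol_c": ["CCl"],
}
ranking = rank_database(database)
```

In this toy setup, a molecule containing several annotated fragments ranks ahead of one containing none, which mirrors the role of the cumulative score as a similarity measure to the reference compounds.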

The study of structure-activity relationships (SARs) of small molecules is of fundamental importance in medicinal chemistry and drug design. Here, we introduce an approach that combines the analysis of similarity-based molecular networks and SAR index distributions to identify multiple SAR components present within sets of active compounds. Different compound classes produce molecular networks of distinct topology.

Molecular substructures and fragment descriptors are important tools for many computational applications in medicinal chemistry and drug design. Here we briefly review the history of molecular fragmentation methods and describe the currently most widely used approaches. Then we focus on random fragmentation methods that have only recently been introduced and discuss selected applications.

The approach termed Determination and Mapping of Activity-Specific Descriptor Value Ranges (MAD) is a conceptually novel molecular similarity method for the identification of active compounds. MAD is based on mapping of compounds to different (multiple) activity class-selective descriptor value ranges. It was recently developed in our laboratory and successfully applied in initial virtual screening trials.
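
The mapping of compounds to descriptor value ranges can be illustrated with a simplified sketch. It assumes that each range is just the minimum and maximum observed among reference actives and that a test compound is scored by the fraction of ranges it falls into; the published MAD procedure for deriving class-selective ranges and scoring compounds is not described in this abstract and is likely more involved. Descriptor names and values are hypothetical.

```python
def value_ranges(reference_compounds):
    """Derive a (min, max) range per descriptor from reference actives."""
    descriptors = reference_compounds[0].keys()
    return {
        d: (min(c[d] for c in reference_compounds),
            max(c[d] for c in reference_compounds))
        for d in descriptors
    }

def match_fraction(compound, ranges):
    """Fraction of descriptor ranges that contain the compound's value."""
    hits = sum(1 for d, (lo, hi) in ranges.items() if lo <= compound[d] <= hi)
    return hits / len(ranges)

# Hypothetical descriptor values for two reference actives and two test compounds.
actives = [
    {"logp": 2.0, "mw": 300.0, "hbd": 1},
    {"logp": 3.0, "mw": 350.0, "hbd": 2},
]
ranges = value_ranges(actives)
inside = match_fraction({"logp": 2.5, "mw": 320.0, "hbd": 1}, ranges)
partial = match_fraction({"logp": 2.5, "mw": 400.0, "hbd": 2}, ranges)
```

A compound that maps into all ranges scores 1.0, while one that falls outside a range scores proportionally lower, so database compounds can be ranked by how well they match the class profile.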

The introduction of molecular similarity analysis in the early 1990s catalyzed the development of many small-molecule-based similarity methods to mine large compound databases for novel active molecules. These efforts have profoundly influenced the field of computer-aided drug discovery and substantially widened the spectrum of available ligand-based virtual screening approaches. However, the principles underlying the computational assessment of molecular similarity are much more multifaceted and complex than they might appear at first glance.

Fingerprints are molecular bit string representations and are among the most popular descriptors for similarity searching. In key-type fingerprints, each bit position monitors the presence or absence of a prespecified chemical or structural feature. In contrast to hashed fingerprints, this keyed design makes it possible to evaluate individual bit positions and the associated structural features during similarity searching.
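
The keyed design described above can be sketched in a few lines. The feature names below are hypothetical placeholders rather than an actual key dictionary, and the Tanimoto coefficient is used here as a common choice of similarity measure, not necessarily the one used in the study.

```python
# Hypothetical feature keys; real key-type fingerprints use curated
# dictionaries of chemical or structural features.
KEYS = ["aromatic_ring", "carboxylic_acid", "amide", "halogen", "hydroxyl"]

def keyed_fingerprint(features):
    """Return one bit per key: 1 if the feature is present, else 0."""
    return [1 if key in features else 0 for key in KEYS]

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: bits set in both / bits set in either."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 0.0

def shared_keys(fp_a, fp_b):
    """Name the features behind the bits set in both fingerprints."""
    return [key for key, a, b in zip(KEYS, fp_a, fp_b) if a and b]

reference = keyed_fingerprint({"aromatic_ring", "amide", "halogen"})
candidate = keyed_fingerprint({"aromatic_ring", "amide", "hydroxyl"})
similarity = tanimoto(reference, candidate)
common = shared_keys(reference, candidate)
```

Because every bit position maps back to a named feature, `shared_keys` can report which features drive a similarity value, which is not possible in the same direct way with hashed fingerprints.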

To systematically compare bioactive and theoretically derived compound conformations, we have analyzed 18 different sets of active small molecules with experimentally determined binding conformations and modeled conformers using a pattern recognition approach. Compound class-specific descriptor value range patterns that accurately distinguish bioactive conformations from other low-energy conformers were identified for all 18 compound classes. Discriminatory patterns were often chemically intuitive and could be well rationalized on the basis of X-ray structures of the protein-ligand complexes.

We introduce fragment formal concept analysis (FragFCA) to study complex relationships between fragments in active compounds taking potency information into account. Fragment combinations that are unique to active or highly potent compounds or that are shared by molecules having different or overlapping activity profiles are systematically identified using chemically intuitive queries of varying complexity. The methodology is applied to analyze fragment distributions in antagonists of seven G protein coupled receptor targets and identify signature fragments.

Similarity searching is often used to preselect compounds for docking, thereby decreasing the size of screening databases. However, integrated structure- and ligand-based screening schemes are rare at present. Docking and similarity search calculations using 2D fingerprints were carried out in a comparative manner on nine target enzymes, for which significant numbers of diverse inhibitors could be obtained.

We report the design of target-selective chemical spaces using CA-DynaMAD, a mapping algorithm that generates and navigates flexible space representations for the identification of active or selective compounds. The algorithm iteratively increases the dimensionality of reference spaces in a controlled manner by evaluating a single descriptor per iteration. For seven sets of closely related biogenic amine G protein coupled receptor (GPCR) antagonists with different selectivity, target-selective reference spaces were designed and used to identify selective compounds by screening a biologically annotated database.

Substructures are among the most preferred molecular descriptors in chemoinformatics and medicinal chemistry. Conventional substructure-type descriptors are typically the result of well-defined design strategies. Previously, we have introduced Activity Class Characteristic Substructures (ACCS) derived from randomly generated molecular fragment populations and described their utility in similarity searching.

A new type of molecular representation is introduced that is based on activity class characteristic substructures extracted from random fragment populations. Mapping of characteristic substructures is used to determine atom match rates in active molecules. Comparison of match rates of bonded atoms defines a hierarchical molecular fragmentation scheme.

The identification of small molecules that are selective for individual targets within target families is an important task in chemical biology. We aim at the development of computational approaches for the study of structure-selectivity relationships and prediction of target-selective ligands. In previous studies, we have introduced the concept of selectivity searching.
