Bioactive compound design based on natural product (NP) structure may be limited because of partial coverage of NP-like chemical space and biological target space. These limitations can be overcome by combining NP-centered strategies with fragment-based compound design through combination of NP-derived fragments to afford structurally unprecedented "pseudo-natural products" (pseudo-NPs). The design, synthesis, and biological evaluation of a collection of indomorphan pseudo-NPs that combine biosynthetically unrelated indole- and morphan-alkaloid fragments are described.
Natural products (NPs) inspire the design and synthesis of novel biologically relevant chemical matter, for instance through biology-oriented synthesis (BIOS). However, BIOS is limited by the partial coverage of NP-like chemical space by the guiding NPs. The design and synthesis of "pseudo NPs" overcomes these limitations by combining NP-inspired strategies with fragment-based compound design through de novo combination of NP-derived fragments to unprecedented compound classes not accessible through biosynthesis.
The principles guiding the design and synthesis of bioactive compounds based on natural product (NP) structure, such as biology-oriented synthesis (BIOS), are limited by their partial coverage of the NP-like chemical space of existing NPs and retainment of bioactivity in the corresponding compound collections. Here we propose and validate a concept to overcome these limitations by de novo combination of NP-derived fragments to structurally unprecedented 'pseudo natural products'. Pseudo NPs inherit characteristic elements of NP structure yet enable the efficient exploration of areas of chemical space not covered by NP-derived chemotypes, and may possess novel bioactivities.
The limited structural diversity that a compound library represents severely restrains the discovery of bioactive small molecules for medicinal chemistry and chemical biology research, and thus calls for developing new divergent synthetic approaches to structurally diverse and complex scaffolds. Here we present a de novo branching cascades approach wherein simple primary substrates follow different cascade reactions to create various distinct molecular frameworks in a scaffold diversity phase. Later, the scaffold elaboration phase introduces further complexity to the scaffolds by creating a number of chiral centres and incorporating new hetero- or carbocyclic rings.
SAR studies were performed on a series of 2-arylamido-5,7-dihydro-4H-thieno[2,3-c]pyran-3-carboxamide derivatives as cannabinoid receptor agonists. Starting from an HTS hit, both potency and selectivity could be improved. Modifications to the thiophene fusion and C-3 amides were studied.
Diversity selection is a common task in early drug discovery. One drawback of current approaches is that usually only structural diversity is taken into account, so activity information is ignored. In this article, we present a modified version of diversity selection, which we term Maximum-Score Diversity Selection, that additionally takes the estimated or predicted activities of the molecules into account.
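To make the idea of weighing activity against structural diversity concrete, here is a minimal sketch. It is not the method from the article: it assumes a simple greedy heuristic, molecules represented as sets of feature indices, Tanimoto distance as the diversity measure, and a hypothetical `weight` parameter that trades score against distance. All function and parameter names are illustrative.

```python
from typing import List, Sequence


def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - Tanimoto similarity between two feature sets."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty feature sets are treated as identical
    return 1.0 - len(a & b) / union


def greedy_score_diversity_selection(
    features: Sequence[frozenset],
    scores: Sequence[float],
    k: int,
    weight: float = 0.5,
) -> List[int]:
    """Greedily pick k indices, balancing score against diversity.

    Illustrative heuristic only (an assumption, not the published
    algorithm). `weight` = 1.0 ranks purely by score; `weight` = 0.0
    ranks purely by distance to the molecules already selected.
    """
    if len(features) != len(scores):
        raise ValueError("features and scores must have the same length")
    if k <= 0 or not features:
        return []
    k = min(k, len(features))

    # Seed the selection with the highest-scoring molecule.
    selected = [max(range(len(scores)), key=lambda i: scores[i])]
    remaining = set(range(len(features))) - set(selected)

    while len(selected) < k:
        def gain(i: int) -> float:
            # Diversity term: distance to the nearest selected molecule.
            min_dist = min(
                tanimoto_distance(features[i], features[j]) for j in selected
            )
            return weight * scores[i] + (1.0 - weight) * min_dist

        # Sort candidates so that ties are broken deterministically.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    # Toy data: molecules 0 and 1 are near-duplicates, molecule 2 is distinct.
    toy_features = [
        frozenset({1, 2, 3}),
        frozenset({1, 2, 3, 4}),
        frozenset({7, 8, 9}),
    ]
    toy_scores = [0.9, 0.85, 0.5]
    print(greedy_score_diversity_selection(toy_features, toy_scores, k=2))
```

With `weight=1.0` the sketch reduces to plain top-k ranking by score; lowering `weight` increasingly favors molecules that are dissimilar to those already chosen, which is the trade-off the abstract describes.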
The goal of this study was to adapt a recently proposed linear large-scale support vector machine to large-scale binary cheminformatics classification problems and to assess its performance on various benchmarks using virtual screening performance measures. We extended the large-scale linear support vector machine library LIBLINEAR with state-of-the-art virtual high-throughput screening metrics to train classifiers on whole large and unbalanced data sets. The formulation of this linear support vector machine performs excellently when applied to high-dimensional sparse feature vectors.
A large variety of log P calculation methods failed to produce sufficient accuracy in log P prediction for two in-house datasets of more than 96000 compounds contrary to their significantly better performances on public datasets. The minimum Root Mean Squared Error (RMSE) of 1.02 and 0.
We first review the state of the art in the development of log P prediction approaches, which fall into two major categories: substructure-based and property-based methods. Then, we compare the predictive power of representative methods for one public (N = 266) and two in-house datasets from Nycomed (N = 882) and Pfizer (N = 95809). A total of 30 and 18 methods were tested for public and industrial datasets, respectively.
The Compressed Feature Matrix (CFM) is a new molecular descriptor for adaptive similarity searching. Depending on the requirements, it is based on a distance or geometry matrix. Thus, the CFM permits topological and three-dimensional comparisons of molecules.