We have analyzed 40 different databases ranging in size from a few thousand to nearly 100 million molecules, comprising a total of over 210 million structures, for their tautomeric conflicts. A tautomeric conflict is defined as an occurrence of two or more structures within a data set identified by the tautomeric rules applied as being tautomers of each other. We tested a total of 119 detailed tautomeric transform rules expressed as SMIRKS, out of which 79 yielded at least one conflict.
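The conflict definition above can be illustrated with a minimal sketch. It assumes each structure has already been assigned a tautomer-invariant key by some rule-based procedure. The key values (`"K1"`, `"K2"`) and the function name `find_tautomeric_conflicts` are hypothetical and not taken from the study, which applied SMIRKS transforms rather than precomputed keys. A conflict is then any key shared by two or more distinct structures in one data set.

```python
from collections import defaultdict


def find_tautomeric_conflicts(records):
    """Return {tautomer_key: sorted structure IDs} for keys shared by 2+ structures.

    `records` is an iterable of (structure_id, tautomer_key) pairs. The key is
    assumed to be identical for all tautomers of one compound (hypothetical input).
    """
    groups = defaultdict(set)
    for structure_id, tautomer_key in records:
        groups[tautomer_key].add(structure_id)
    # A conflict needs at least two distinct structures under the same key.
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) >= 2}


if __name__ == "__main__":
    # 2-hydroxypyridine and 2-pyridone are tautomers; ethanol stands alone.
    demo = [
        ("Oc1ccccn1", "K1"),
        ("O=c1cccc[nH]1", "K1"),
        ("CCO", "K2"),
    ]
    for key, members in find_tautomeric_conflicts(demo).items():
        print(key, members)
```

Grouping by key takes a single pass over the data, which matters at the scale of hundreds of millions of structures. The hard part, deriving a reliable key from the transform rules, is deliberately left outside this sketch.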
Expert Opin Drug Discov
September 2024
Although the size of virtual libraries of synthesizable compounds is growing rapidly, we are still enumerating only tiny fractions of the drug-like chemical universe. Our capability to mine these newly generated libraries also lags their growth. That is why fragment-based approaches that utilize on-demand virtual combinatorial libraries are gaining popularity in drug discovery.
Designing new medicines more cheaply and quickly is tightly linked to the quest of exploring chemical space more widely and efficiently. Chemical space is monumentally large, but recent advances in computer software and hardware have enabled researchers to navigate virtual chemical spaces containing billions of chemical structures. This review specifically concerns collections of many millions or even billions of enumerated chemical structures as well as even larger chemical spaces that are not fully enumerated.
Germline antibodies, the initial set of antibodies produced by the immune system, are critical for host defense, and information about their binding properties can be useful for designing vaccines, understanding the origins of autoantibodies, and developing monoclonal antibodies. Numerous studies have found that germline antibodies are polyreactive with malleable, flexible binding pockets. While insightful, it remains unclear how broadly this model applies, as there are many families of antibodies that have not yet been studied.
Computational methods to predict molecular properties regarding safety and toxicology represent alternative approaches to expedite drug development, screen environmental chemicals, and thus significantly reduce associated time and costs. There is a strong need and interest in the development of computational methods that yield reliable predictions of toxicity, and many approaches, including the recently introduced deep neural networks, have been leveraged towards this goal. Herein, we report on the collection, curation, and integration of data from the public data sets that were the source of the ChemIDplus database for systemic acute toxicity.
In the past two decades, many different formats for molecules and reactions have been created. These formats were mostly developed for the purposes of identifiers, representation, classification, analysis and data exchange. Considerable effort has gone into molecule formats but only a little into reaction formats, where the work has been done mostly by companies, leading to proprietary formats.
Due to its antiangiogenic and anti-immunomodulatory activity, thalidomide continues to be of clinical interest despite its teratogenic actions, and efforts to synthesize safer, clinically active thalidomide analogs are continually underway. In this study, a cohort of 27 chemically diverse thalidomide analogs was evaluated for antiangiogenic activity in an ex vivo rat aorta ring assay. The protein cereblon has been identified as the target for thalidomide, and in silico pharmacophore analysis and molecular docking with a crystal structure of human cereblon were used to investigate the cereblon binding abilities of the thalidomide analogs.
We have made available a database of over 1 billion compounds predicted to be easily synthesizable, called Synthetically Accessible Virtual Inventory (SAVI). They have been created by a set of transforms based on an adaptation and extension of the CHMTRN/PATRAN programming languages describing chemical synthesis expert knowledge, which originally stem from the LHASA project. The chemoinformatics toolkit CACTVS was used to apply a total of 53 transforms to about 150,000 readily available building blocks (enamine.
We have adopted and extended the CHMTRN language and used it for the knowledge base of a computer program to generate a large database of synthetically accessible, drug-like chemical structures, the Synthetically Accessible Virtual Inventory (SAVI) Database. CHMTRN is a powerful language originally developed in the LHASA (Logic and Heuristics Applied to Synthetic Analysis) project at Harvard University and used together with the chemical pattern description language, PATRAN, to describe chemical retro-reactions. The languages have proven to be useful beyond the design of retrosynthetic routes and have the potential for much wider use in chemistry; this paper describes CHMTRN and PATRAN as now reimplemented for the forward-synthetic SAVI project but able to describe both forward and retro-reactions.
We have collected 86 different transforms of tautomeric interconversions. Out of those, 54 are for prototropic (non-ring-chain) tautomerism, 21 for ring-chain tautomerism, and 11 for valence tautomerism. The majority of these rules have been extracted from experimental literature.
We report a database of tautomeric structures that contains 2819 tautomeric tuples extracted from 171 publications. Each tautomeric entry has been annotated with experimental conditions reported in the respective publication, plus bibliographic details, structural identifiers (e.g.
Despite the achievements of antiretroviral therapy, discovery of new anti-HIV medicines remains an essential task because the existing drugs do not provide a complete cure for infected patients, exhibit severe adverse effects, and lead to the appearance of resistant strains. To predict the interaction of drug-like compounds with multiple targets for HIV treatment, the ligand-based drug design approach is widely applied. In this study, we evaluated the possibilities and limitations of (Q)SAR analysis aimed at the discovery of novel antiretroviral agents inhibiting the vital HIV enzymes.
Existing data on structures and biological activities are limited and distributed unevenly across distinct molecular targets and chemical compounds. The question arises whether these data represent an unbiased sample of the general population of chemical-biological interactions. To answer this question, we analyzed ChEMBL data for 87,583 molecules tested against 919 protein targets using supervised and unsupervised approaches.
The C-terminal binding protein (CtBP) is an NADH-dependent dimeric family of nuclear proteins that scaffold interactions between transcriptional regulators and chromatin-modifying complexes. Its association with poor survival in several cancers implicates CtBP as a promising target for pharmacological intervention. We employed computer-assisted drug design to search for CtBP inhibitors, using quantitative structure-activity relationship (QSAR) modeling and docking.
Large amounts of high-quality data on the biological activity of chemical compounds are required throughout the whole drug discovery process: from the development of computational models of structure-activity relationships to experimental testing of lead compounds and their validation in clinics. Currently, a large amount of such data is available from databases, scientific publications, and patents. Biological data are characterized by incompleteness, uncertainty, and low reproducibility.
Motivation: Identification of new molecules promising for the treatment of HIV infection and HIV-associated disorders remains an important task in order to provide safer and more effective therapies. Applying prior knowledge through computer-aided drug discovery approaches reduces time and financial expenses and increases the chances of positive results in anti-HIV R&D. To provide the scientific community with a tool for evaluating potential agents for the treatment of HIV infection and its comorbidities, we have created a freely available web resource for prediction of relevant biological activities based on the structural formulae of drug-like molecules.
Subatomic resolution macromolecular crystallography has been revealing the most fascinating details of macromolecular structures for many years. This most extreme form of macromolecular crystallography is going through rapid changes. A new generation of superbrilliant X-ray sources and detectors is facilitating the rapid acquisition of high-quality datasets.
Despite significant advances in the application of highly active antiretroviral therapy, the development of new drugs for the treatment of HIV infection remains an important task because the existing drugs do not provide a complete cure, cause serious side effects and lead to the emergence of resistance. In 2015, a consortium of American and European scientists and specialists launched a project to create the SAVI (Synthetically Accessible Virtual Inventory) library. Its 2016 version, comprising over 283 million structures of new easily synthesizable organic molecules, each annotated with a proposed synthetic route, was generated for the purpose of searching for safer and more potent pharmacological substances.
Non-B DNA structures represent intriguing and challenging targets for small molecules. For example, the promoter of the oncogene contains multiple G-quadruplex and i-motif structures, atypical globular folds that serve as molecular switches for gene expression. Of the two, i-motif structures are far less studied.
Discovery of new pharmaceutical substances is currently boosted by the availability of the Synthetically Accessible Virtual Inventory (SAVI) library, which includes about 283 million molecules, each annotated with a proposed one-step synthetic route from commercially available starting materials. The SAVI database is well-suited for ligand-based methods of virtual screening to select molecules for experimental testing. In this study, we compare the performance of three approaches for the analysis of structure-activity relationships that differ in their criteria for selecting the "active" and "inactive" compounds included in the training sets.
Purpose: Although low-molecular-weight heparin (LMWH) remains the standard of care, factor Xa inhibitors such as rivaroxaban may serve as an alternative treatment for venous thromboembolism (VTE) in patients with active malignancy. The purpose of the analysis was to evaluate outcomes of VTE management in cancer patients treated with rivaroxaban compared to enoxaparin. Methods: This single-center retrospective analysis was conducted on patients with malignancy-associated VTE initiated on treatment with either rivaroxaban or enoxaparin.
Novel piperidinyl-based sulfamide derivatives were designed and synthesized through various synthetic routes. Anticancer activities of these sulfamides were evaluated by phenotypic screening on the National Cancer Institute's 60 human tumor cell lines (NCI-60). Preliminary screening at 10 μM concentration showed that most of the cell lines in the panel were sensitive to piperidinyl sulfamide aminoester 26 (NSC 749204).