Active learning (AL) has become a subject of active research in both industry and academia as an efficient approach for the rapid design and discovery of novel chemicals, materials, and polymers. Herein, we have assessed the applicability of AL for the discovery of polymeric micelle formulations for poorly soluble drugs. We were motivated by the key advantages that make this approach a desirable strategy for the rational design of drug delivery systems: its ability to (i) employ relatively small datasets for model development, (ii) iterate between model development and model assessment using small external datasets that can be either generated in focused experimental studies or formed from subsets of the initial training data, and (iii) progressively evolve models towards increasingly reliable predictions and the identification of novel chemicals with the desired properties.
In this paper, we compare the most popular Atom-to-Atom Mapping (AAM) tools, ChemAxon, Indigo, RDTool, NameRXN (NextMove), and RXNMapper, which implement different AAM algorithms. The open-source RDTool program was optimized, and its modified version ("new RDTool") was considered together with several consensus mapping strategies. The Condensed Graph of Reaction approach was used to calculate chemical distances and to develop the "AAM fixer" algorithm for automated correction of erroneous mappings.
Defining a model's applicability domain (AD) is an active research topic in chemoinformatics. Although many AD definitions have been described in the literature for models predicting properties of molecules (Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models), none has been reported to date for models of chemical reactions (Quantitative Reaction-Property Relationships (QRPR)). The reason is that a chemical reaction is a much more complex object than an individual molecule: its yield and its thermodynamic and kinetic characteristics depend not only on the structures of reactants and products but also on experimental conditions.
Pharmacophore modeling is usually considered a special type of virtual screening that lacks a probabilistic nature. Correspondence of at least one conformation of a molecule to a pharmacophore is taken as evidence of its bioactivity. We show that pharmacophores can be treated as one-class machine learning models, and that a probability reflecting the model's confidence can be assigned to a pharmacophore on the basis of its precision in identifying active compounds on a calibration set.
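The precision-based confidence described above can be illustrated with a minimal sketch. It assumes the calibration set is available as two parallel lists of booleans (whether each compound matches the pharmacophore, and whether it is experimentally active). The function name and the example data are hypothetical, and this is not necessarily how the authors computed their probabilities.

```python
def pharmacophore_precision(matches, is_active):
    """Precision of a pharmacophore on a calibration set.

    matches[i]   -- True if at least one conformation of compound i fits the pharmacophore
    is_active[i] -- True if compound i is experimentally active
    Returns the fraction of matched compounds that are active,
    or None when the pharmacophore matches no calibration compound.
    """
    matched_labels = [active for matched, active in zip(matches, is_active) if matched]
    if not matched_labels:
        return None  # precision is undefined without any matched compound
    return sum(matched_labels) / len(matched_labels)


# Hypothetical calibration data (illustrative values only, not from the paper).
matches = [True, True, True, True, False, False, True, False]
is_active = [True, True, False, True, True, False, True, False]

# The precision is used here as the confidence attached to every hit of this pharmacophore.
confidence = pharmacophore_precision(matches, is_active)
print(confidence)
```

In this sketch, a new compound that matches the pharmacophore would simply inherit `confidence` as its estimated probability of being active.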
Here, we describe the concept of conjugated models for several properties (activities) linked by a strict mathematical relationship. This relationship can be integrated analytically into the ridge regression (RR) algorithm or accounted for in a special case of "twin" neural networks (NN). The developed approaches were applied to modeling the logarithm of the prototropic tautomeric constant (logK), which can be expressed as the difference between the acidity constants (pKa) of two related tautomers.
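The relationship between logK and pKa can be written out as a short derivation, followed by one possible way of building it into a ridge regression objective. The convention K_T = [T_2]/[T_1], the linear form of the pKa model, the descriptor vectors x, and the loss below are assumptions made for illustration. They sketch the idea of a conjugated model and are not necessarily the formulation used by the authors.

```latex
% Assumed convention: K_T = [T_2]/[T_1]; both tautomers share the conjugate base A^-.
\begin{align}
K_a(T_i) &= \frac{[\mathrm{A^-}][\mathrm{H^+}]}{[T_i]}, \quad i = 1, 2 \\
K_T &= \frac{[T_2]}{[T_1]} = \frac{K_a(T_1)}{K_a(T_2)} \\
\log K_T &= \mathrm{p}K_a(T_2) - \mathrm{p}K_a(T_1)
\end{align}

% Assumed linear model: pKa(T) = w^T x(T) + b, with x(T) a descriptor vector.
% The intercept cancels in the difference, so the same weights w predict log K_T.
\begin{align}
\log K_T &\approx \mathbf{w}^{\top}\Delta\mathbf{x},
  \qquad \Delta\mathbf{x} = \mathbf{x}(T_2) - \mathbf{x}(T_1) \\
\mathcal{L}(\mathbf{w}, b) &=
  \sum_{i}\bigl(\mathbf{w}^{\top}\mathbf{x}_i + b - \mathrm{p}K_{a,i}\bigr)^2
  + \sum_{j}\bigl(\mathbf{w}^{\top}\Delta\mathbf{x}_j - \log K_{T,j}\bigr)^2
  + \lambda \lVert \mathbf{w} \rVert^2
\end{align}
```

Under these assumptions, pKa and logK measurements both constrain the same weight vector, which is what links the two property models.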