Publications by authors named "Ramil Nugmanov"

In this work, we propose a versatile molecule and reaction encoding binary data format that aims to bridge the gap between the advantages of SMILES, like local stereo- and implicit hydrogen encoding, and block-structured MDL MOL with a 2D layout and explicit bond encoding, while addressing their respective limitations. Our new format introduces a balance between size efficiency, processing speed, and comprehensive representation, making it well-suited for various applications in cheminformatics, including deep learning, data storage, and searching. By offering an explicit approach to store atom connectivity (including implicit hydrogens), electronic state, stereochemistry, and other crucial molecular attributes, our proposal seeks to enhance data storage efficiency and promote interoperability among different software tools.

The kinetic data indicate that the addition of tertiary phosphines to α-methylene lactones in acetic acid is strongly accelerated in comparison to the reactions of related open-chain esters. Six-membered α-methylene-δ-valerolactone exhibited a more pronounced rate increase than five-membered α-methylene-γ-butyrolactone. The use of α-methylene-γ-butyrolactam as a nitrogen analogue of α-methylene-γ-butyrolactone resulted in a total loss of the reaction acceleration.

Artificial Intelligence is revolutionizing many aspects of the pharmaceutical industry. Deep learning models are now routinely applied to guide drug discovery projects, leading to faster and improved findings, but many tasks still have enormous unrealized potential. One such task is reaction yield prediction.

Fluorescent derivatives attract the attention of researchers for their use as sensors, photocatalysts and for the creation of functional materials. In order to create amphiphilic fluorescent derivatives of calixarenes, a fluorescein derivative containing oligoethylene glycol and propargyl groups was obtained. The resulting fluorescein derivative was introduced into three different (thia)calix[4]arene azide derivatives.

This work introduces a new algorithm for reaction atom-to-atom mapping (AAM) based on a transformer neural network adapted for the direct processing of molecular graphs as sets of atoms and bonds, as opposed to SMILES/SELFIES sequence-based approaches, in combination with the Bidirectional Encoder Representations from Transformers (BERT) network. The graph transformer serves to extract molecular features that are tied to atoms and bonds. The BERT network is used for chemical transformation learning.

The selection of experimental conditions leading to a reasonable yield is an important and essential element for the automated development of a synthesis plan and the subsequent synthesis of the target compound. The classical QSPR approach, requiring one-to-one correspondence between chemical structure and a target property, can be used for optimal reaction conditions prediction only on a limited scale when only one condition component (e.g.

This work introduces CGRdb2.0, an open-source database management system for molecules, reactions, and chemical data. CGRdb2.

In this paper, we compare the most popular Atom-to-Atom Mapping (AAM) tools: ChemAxon, Indigo, RDTool, NameRXN (NextMove), and RXNMapper which implement different AAM algorithms. An open-source RDTool program was optimized, and its modified version ("new RDTool") was considered together with several consensus mapping strategies. The Condensed Graph of Reaction approach was used to calculate chemical distances and develop the "AAM fixer" algorithm for an automatized correction of erroneous mapping.

Host-guest complexes of three calixarenes with the most widespread cationic and zwitterionic rhodamine dyes (123, 6G and B) were studied as a potential hypoxia-sensitive system using UV-vis spectrometry and fluorescence as well as 1D and 2D NMR techniques. The calixarenes comprised two with four anionic carboxyl or sulfonate azo fragments on the upper rim and a newly synthesized bis-azo adduct of calixarene in the cone configuration with azo fragments on the lower rim. It was found that all three calixarenes form complexes with rhodamine dyes with a 1:1 composition. The association constants of calixarene-dye complexes with sulfonate calixarenes, especially in the case of tetra-anionic calixarene, turned out to be higher than those with carboxyl calixarene due to more intense electrostatic interactions.

Modern QSAR approaches have wide practical applications in drug discovery for designing potentially bioactive molecules. If such models are based on the use of 2D descriptors, important information contained in the spatial structures of molecules is lost. The major problem in constructing models using 3D descriptors is the choice of a putative bioactive conformation, which affects the predictive performance.

The quality of experimental data for chemical reactions is a critical consideration for any reaction-driven study. However, the curation of reaction data has not been extensively discussed in the literature so far. Here, we suggest a four-step protocol that includes the curation of individual structures (reactants and products), chemical transformations, reaction conditions, and endpoints.

Understanding the interaction of ions with organic receptors in confined space is of fundamental importance and could advance nanoelectronics and sensor design. In this work, metal ion complexation of conformationally varied thiacalix[4]monocrowns bearing lower-rim hydroxy (type I), dodecyloxy (type II), or methoxy (type III) fragments was evaluated. At the liquid-liquid interface, alkylated thiacalixcrowns-5(6) selectively extract alkali metal ions according to the induced-fit concept, whereas crown-4 receptors were ineffective due to distortion of the crown-ether cavity, as predicted by quantum-chemical calculations.

The "creativity" of Artificial Intelligence (AI) in terms of generating de novo molecular structures opened a novel paradigm in compound design, weaknesses (stability & feasibility issues of such structures) notwithstanding. Here we show that "creative" AI may be as successfully taught to enumerate novel chemical reactions that are stoichiometrically coherent. Furthermore, when coupled to reaction space cartography, de novo reaction design may be focused on the desired reaction class.

Presently, quantum chemical calculations are widely used to generate extensive data sets for machine learning applications; however, generally, these sets only include information on equilibrium structures and some close conformers. Exploration of potential energy surfaces provides important information on ground and transition states, but analysis of such data is complicated due to the number of possible reaction pathways. Here, we present RePathDB, a database system for managing 3D structural data for both ground and transition states resulting from quantum chemical calculations.

Defining a model's applicability domain (AD) is an active research topic in chemoinformatics. Although many AD definitions have been described in the literature for models predicting properties of molecules (Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models), none has been reported to date for chemical reactions (Quantitative Reaction-Property Relationships (QRPR)). The reason is that a chemical reaction is a much more complex object than an individual molecule: its yield and its thermodynamic and kinetic characteristics depend not only on the structures of reactants and products but also on experimental conditions.

Here, we describe a concept of conjugated models for several properties (activities) linked by a strict mathematical relationship. This relationship can be directly integrated analytically into the ridge regression (RR) algorithm or accounted for in a special case of "twin" neural networks (NN). Developed approaches were applied to the modeling of the logarithm of the prototropic tautomeric constant (logK) which can be expressed as the difference between the acidity constants (pKa) of two related tautomers.
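
The idea of building the relationship into ridge regression can be illustrated with a minimal sketch. It assumes a linear pKa model with shared weights, so that the tautomeric constant depends only on the difference between the descriptor vectors of the two tautomers. The sign convention, the function names, and the synthetic data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fit_conjugated_ridge(x_a, x_b, log_k, alpha=1.0):
    """Fit weights w so that log_k ~ (x_a - x_b) @ w with an L2 penalty alpha.

    Assumption: pKa(x) = x @ w + b for both tautomers, so the intercept b
    cancels in the difference and only w has to be estimated.
    """
    diff = np.asarray(x_a, dtype=float) - np.asarray(x_b, dtype=float)
    gram = diff.T @ diff + alpha * np.eye(diff.shape[1])
    return np.linalg.solve(gram, diff.T @ np.asarray(log_k, dtype=float))

def predict_log_k(w, x_a, x_b):
    """Predict the tautomeric constant from the two descriptor matrices."""
    return (np.asarray(x_a, dtype=float) - np.asarray(x_b, dtype=float)) @ w

# Synthetic, noise-free data (hypothetical descriptors, not real tautomers).
rng = np.random.default_rng(0)
true_w = np.array([0.5, -1.0, 2.0])
x_a = rng.normal(size=(50, 3))
x_b = rng.normal(size=(50, 3))
log_k = (x_a - x_b) @ true_w

w = fit_conjugated_ridge(x_a, x_b, log_k, alpha=1e-8)
```

Because the prediction depends only on the descriptor difference, swapping the two tautomers flips the sign of the predicted value, which is the consistency that a difference of two pKa values requires.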

CGRtools is an open-source Python library for handling molecular and reaction information. It is the only library developed so far that can handle the condensed graph of reaction (CGR). The CGR enables advanced operations on reaction information and can be used for reaction descriptor calculation, structure-reactivity modeling, atom-to-atom mapping comparison and correction, reaction center extraction, reaction balancing, and some other related tasks.

Here, we report the data visualization, analysis, and modeling of a large set of 4830 SN2 reactions whose rate constants (logk) were measured under different experimental conditions (solvent, temperature). The reactions were encoded as a single molecular graph, the Condensed Graph of Reaction, which allowed us to use conventional chemoinformatics techniques developed for individual molecules. Thus, the Matched Reaction Pairs approach was suggested and used to analyze substituent effects on the reactivity of substrates and nucleophiles.
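
As a rough illustration of how a reaction can be condensed into a single graph, the sketch below merges atom-mapped bond tables of reactants and products using plain Python dictionaries. It does not use the CGRtools API; the function names, the data layout, and the toy reaction are assumptions chosen for brevity.

```python
def condense(reactant_bonds, product_bonds):
    """Merge two atom-mapped bond tables into one condensed graph.

    Each table maps a frozenset of two atom-map numbers to a bond order.
    The result maps each atom pair to (order_before, order_after), where
    None means the bond is absent on that side of the reaction.
    """
    pairs = set(reactant_bonds) | set(product_bonds)
    return {p: (reactant_bonds.get(p), product_bonds.get(p)) for p in pairs}

def reaction_center(cgr):
    """Return the atoms involved in at least one changed (dynamic) bond."""
    atoms = set()
    for pair, (before, after) in cgr.items():
        if before != after:
            atoms |= pair
    return atoms

# Toy substitution, heavy atoms only: CH3CH2Br + OH- -> CH3CH2OH + Br-
# Atom-map numbers (hypothetical): 1 = CH2 carbon, 2 = Br, 3 = O, 4 = CH3 carbon.
reactants = {frozenset({1, 2}): 1, frozenset({1, 4}): 1}
products = {frozenset({1, 3}): 1, frozenset({1, 4}): 1}

cgr = condense(reactants, products)
center = reaction_center(cgr)
```

In this toy case the C-Br bond is marked as broken, the C-O bond as formed, and the C-C bond as unchanged, so the reaction center contains atoms 1, 2 and 3 but not atom 4.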

The synthesis of new calix[4]arenes adopting a stereoisomeric form bearing two or four azide fragments on the upper rim and water-soluble triazolyl amphiphilic receptors with two or four polyammonium headgroups via copper-catalyzed azide-alkyne cycloaddition reaction has been performed for the first time. It was found that the synthesized macrocycles form stable aggregates with hydrodynamic diameters between 150 and 200 nm and electrokinetic potentials of about +40 to +60 mV in aqueous solutions. Critical aggregation concentration (CAC) values were measured using a micelle method with pyrene and eosin Y as dye probes.

This paper reports SVR (Support Vector Regression) and GTM (Generative Topographic Mapping) modeling of three kinetic properties of cycloaddition reactions: rate constant (logk), activation energy (Ea) and pre-exponential factor (logA). A data set of 1849 reactions, comprising (4+2), (3+2) and (2+2) cycloadditions (CA), was studied in different solvents and at different temperatures. The reactions were encoded by the ISIDA fragment descriptors generated for the Condensed Graph of Reaction (CGR).

The Generative Topographic Mapping (GTM) approach was successfully used to visualize, analyze and model the equilibrium constants (K) of tautomeric transformations as a function of both structure and experimental conditions. The modeling set contained 695 entries corresponding to 350 unique transformations of 10 tautomeric types, for which K values were measured in different solvents and at different temperatures. Two types of GTM-based classification models were trained: first, a "structural" approach focused on separating tautomeric classes, irrespective of reaction conditions, then a "general" approach accounting for both structure and conditions.

We describe a novel approach of reaction representation as a combination of two mixtures: a mixture of reactants and a mixture of products. In turn, each mixture can be encoded using an earlier reported approach involving simplex descriptors (SiRMS). The feature vector representing these two mixtures results from either concatenated product and reactant descriptors or the difference between descriptors of products and reactants.
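
The two ways of combining the mixture descriptors can be sketched as follows. Summing per-molecule vectors is only a stand-in for the SiRMS mixture encoding, which is not described here, and the function names, the concatenation order, and the numbers are illustrative assumptions.

```python
def mixture_descriptor(component_vectors):
    """Combine per-molecule descriptor vectors into one mixture vector.

    Assumption: a plain element-wise sum stands in for the SiRMS
    mixture encoding mentioned in the abstract.
    """
    return [sum(values) for values in zip(*component_vectors)]

def reaction_vector(reactants, products, mode="concat"):
    """Build a reaction feature vector from two mixtures.

    mode="concat"     -> reactant descriptors followed by product descriptors
    mode="difference" -> product descriptors minus reactant descriptors
    """
    r = mixture_descriptor(reactants)
    p = mixture_descriptor(products)
    if mode == "concat":
        return r + p
    if mode == "difference":
        return [pv - rv for pv, rv in zip(p, r)]
    raise ValueError(f"unknown mode: {mode!r}")

# Hypothetical three-feature descriptors for two reactants and one product.
reactants = [[1, 0, 2], [0, 1, 0]]
products = [[1, 1, 1]]

concatenated = reaction_vector(reactants, products, mode="concat")
difference = reaction_vector(reactants, products, mode="difference")
```

Concatenation keeps the full description of both mixtures at the cost of doubling the vector length, while the difference keeps the original length and retains only what changes between reactants and products.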

We report a new method to assess the reactivity of protective groups (PGs) as a function of reaction conditions (catalyst, solvent) using raw reaction data. It is based on an intuitive similarity principle for chemical reactions: similar reactions proceed under similar conditions. Technically, reaction similarity can be assessed using the Condensed Graph of Reaction (CGR) approach representing an ensemble of reactants and products as a single molecular graph, i.
