BMC Bioinformatics
August 2024
Background: Drug discovery and development is the extremely costly and time-consuming process of identifying new molecules that can interact with a biomarker target to interrupt the disease pathway of interest. In addition to binding the target, a drug candidate needs to satisfy multiple properties affecting absorption, distribution, metabolism, excretion, and toxicity (ADMET). Artificial intelligence approaches provide an opportunity to improve each step of the drug discovery and development process. The first question is how a molecule can be represented informatively so that the in-silico solutions are optimized.
Ni-CeO nanoparticles (NPs) are promising nanocatalysts for water splitting and water gas shift reactions due to the ability of ceria to temporarily donate oxygen to the catalytic reaction and accept oxygen after the reaction is completed. Therefore, elucidating how different properties of the Ni-Ceria NPs relate to the activity and selectivity of the catalytic reaction is of crucial importance for the development of novel catalysts. In this work, the active learning (AL) method based on machine learning regression and its uncertainty is used for the global optimization of CeNiO (x = 1, 2, 3) nanoparticles, employing density functional theory calculations.
Reinforcement learning (RL) methods have helped to define the state of the art in the field of modern artificial intelligence, mostly after the breakthrough involving AlphaGo and the discovery of novel algorithms. In this work, we present an RL method, based on Q-learning, for the structural determination of adsorbate@substrate models in silico, where the minimization of the energy landscape resulting from adsorbate interactions with a substrate is made by actions on states (translations and rotations) chosen from an agent's policy. The proposed RL method is implemented in an early version of the reinforcement learning software for materials design and discovery (RLMaterial), developed in Python3.
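As a rough illustration of the Q-learning idea described in this abstract, the following is a minimal sketch and not the RLMaterial implementation. It assumes a toy problem: the state is a position on a 1D grid, the actions are single-step translations (rotations are omitted), and a hypothetical lookup table of energies stands in for the computed adsorbate-substrate energy. All names, parameters, and the reward definition (energy decrease per step) are assumptions made for the example.

```python
import random

# Hypothetical toy setup: positions on a 1D grid stand in for adsorbate
# placements, and a lookup table stands in for the computed energy.
ACTIONS = (-1, 0, 1)  # translate left, stay, translate right


def step(state, action, n_states):
    """Apply a translation and clamp the result to the grid."""
    return min(max(state + action, 0), n_states - 1)


def greedy_action(q_row):
    """Index of the highest-valued action (the first one wins ties)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_learning(energies, episodes=2000, steps=20,
               alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning where the reward is the energy decrease of a move."""
    rng = random.Random(seed)
    n = len(energies)
    q = [[0.0] * len(ACTIONS) for _ in range(n)]
    for _ in range(episodes):
        state = rng.randrange(n)  # random starting placement
        for _ in range(steps):
            # Epsilon-greedy policy: explore sometimes, otherwise exploit.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q[state])
            nxt = step(state, ACTIONS[a], n)
            reward = energies[state] - energies[nxt]
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def follow_policy(q, start, steps=20):
    """Roll out the learned greedy policy from a starting state."""
    state = start
    for _ in range(steps):
        state = step(state, ACTIONS[greedy_action(q[state])], len(q))
    return state


# Assumed single-well energy landscape with its minimum at position 5.
energies = [(x - 5) ** 2 for x in range(9)]
q_table = q_learning(energies)
final_state = follow_policy(q_table, start=0)
```

The "stay" action is included so that the greedy policy can settle at a minimum instead of oscillating around it. A realistic application would replace the energy table with an electronic-structure or force-field calculation and use a far larger state and action space.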
This paper (i) explores the internal structure of two quantum mechanics datasets (QM7b, QM9), composed of several thousands of organic molecules and described in terms of electronic properties, and (ii) further explores an inverse design approach to molecular design consisting of using machine learning methods to approximate the atomic composition of molecules, using QM9 data. Understanding the structure and characteristics of this kind of data is important when predicting the atomic composition from physical-chemical properties in inverse molecular designs. Intrinsic dimension analysis, clustering, and outlier detection methods were used in the study.
Since the form of the exact functional in density functional theory is unknown, we must rely on density functional approximations (DFAs). In the past, very promising results have been reported by combining semi-local DFAs with exact, i.e.
Structural elucidation of chemical compounds is challenging experimentally, and theoretical chemistry methods have added important insight into molecules, nanoparticles, alloys, and materials geometries and properties. However, finding the optimum structures is a bottleneck due to the huge search space, and global search algorithms have been used successfully for this purpose. In this work, we present the quantum machine learning software/agent for materials design and discovery (QMLMaterial), intended for automatic structural determination for several chemical systems: atomic clusters, atomic clusters and the spin multiplicity together, doping in clusters or solids, vacancies in clusters or solids, adsorption of molecules or adsorbents on surfaces, and finally atomic clusters on solid surfaces/materials or encapsulated in porous materials.
Drug design and optimization are challenging tasks that call for strategic and efficient exploration of the extremely vast search space. Multiple fragmentation strategies have been proposed in the literature to mitigate the complexity of the molecular search space. From an optimization standpoint, drug design can be considered as a multi-objective optimization problem.
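To make the multi-objective framing concrete, here is a minimal sketch of Pareto dominance. It is not taken from the paper. It assumes every objective is to be minimized and that each candidate molecule is summarized by a tuple of objective values. The function names and the two-objective scores are made-up illustrative values.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    Assumes all objectives are minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# Made-up scores for five hypothetical candidates: (objective 1, objective 2).
candidates = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = pareto_front(candidates)
```

In this example (3, 3) and (4, 4) are both dominated by (2, 2), so the front holds the three remaining trade-offs. A multi-objective optimizer would typically rank or select candidates using such non-dominated sets instead of a single scalar score.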
Genetic algorithms (GAs) are stochastic global search methods inspired by biological evolution. They have been used extensively in chemistry and materials science coupled with theoretical methods, ranging from force-fields to high-throughput first-principles methods. The methodology allows an accurate and automated structural determination for molecules, atomic clusters, nanoparticles, and solid surfaces, fundamental to understanding chemical processes in catalysis and environmental sciences, for instance.
The design of a new therapeutic agent is a time-consuming and expensive process. The rise of machine intelligence provides an opportunity to discover novel drug candidates quickly through smart search in the vast molecular structural space. In this paper, we propose a new approach called adversarial deep evolutionary learning (ADEL) to search for novel molecules in the latent space of an adversarial generative model and keep improving the latent representation space.
Finding the optimum structures of non-stoichiometric or berthollide materials, such as (1D, 2D, 3D) materials or nanoparticles (0D), is challenging due to the huge chemical/structural search space. Computational methods coupled with global optimization algorithms have been used successfully for this purpose. In this work, we have developed an artificial intelligence method based on active learning (AL) or Bayesian optimization for the automatic structural elucidation of vacancies in solids and nanoparticles.
Drug discovery is a challenging process with a huge molecular space to be explored and numerous pharmacological properties to be appropriately considered. Among various drug design protocols, fragment-based drug design is an effective way of constraining the search space and better utilizing biologically active compounds. Motivated by fragment-based drug search for a given protein target and the emergence of artificial intelligence (AI) approaches in this field, this work advances the field of drug design by (1) integrating a graph fragmentation-based deep generative model with a deep evolutionary learning process for large-scale multi-objective molecular optimization, and (2) applying protein-ligand binding affinity scores together with other desired physicochemical properties as objectives.
Adsorbate interactions with substrates (e.g. surfaces and nanoparticles) are fundamental for several technologies, such as functional materials, supramolecular chemistry, and solvent interactions.
Annu Int Conf IEEE Eng Med Biol Soc
July 2018
Targeted therapy is a treatment that targets the cancer's specific genes, proteins, or the tissue environment that contributes to cancer growth and survival. Identification of therapeutic targets is a very challenging problem in bioinformatics. An integrative and iterative approach for the identification of drug-gene modules (i.
Background: Phenotypic studies in Triticeae have shown that low temperature-induced protective mechanisms are developmentally regulated and involve dynamic acclimation processes. Understanding these mechanisms is important for breeding cold-resistant wheat cultivars. In this study, we combined three computational techniques for the analysis of gene expression data from spring and winter wheat cultivars subjected to low temperature treatments.
Background: While the gargantuan multi-nation effort of sequencing T. aestivum gets close to completion, the annotation process for the vast number of wheat genes and proteins is in its infancy. Previous experimental studies carried out on model plant organisms such as A.
Background: Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data.
Background: Modern high throughput experimental techniques such as DNA microarrays often result in large lists of genes. Computational biology tools such as clustering are then used to group together genes based on their similarity in expression profiles. Genes in each group are probably functionally related.
Background: Time series gene expression data analysis is used widely to study the dynamics of various cell processes. Most of the time series data available today consist of few time points only, thus making the application of standard clustering techniques difficult.
Results: We developed two new algorithms that are capable of extracting biological patterns from short time point series gene expression data.
One reason that ovarian cancer is such a deadly disease is that it is not usually diagnosed until it has reached an advanced stage. In this study, we developed a novel algorithm for group biomarker identification using gene expression data. Group biomarkers consist of coregulated genes across normal and different stage diseased tissues.