Publications by authors named "Daniil Polykovskiy"

Large Language Models (LLMs) have substantially driven scientific progress in various domains, and many papers have demonstrated their ability to tackle complex problems with creative solutions. Our paper introduces a new foundation model, nach0, capable of solving various chemical and biological tasks: biomedical question answering, named entity recognition, molecular generation, molecular synthesis, attributes prediction, and others. nach0 is a multi-domain and multi-task encoder-decoder LLM pre-trained on unlabeled text from scientific literature, patents, and molecule strings to incorporate a range of chemical and linguistic knowledge.


Fast and accurate conformation space modeling is an essential part of computational approaches to ligand- and structure-based drug discovery problems. Recent state-of-the-art diffusion models for molecular conformation generation show promising distribution coverage and physical plausibility metrics but suffer from a slow sampling procedure. We propose a novel adversarial generative framework, COSMIC, that shows comparable generative performance but provides time-efficient sampling and training procedures.


Idiopathic pulmonary fibrosis (IPF) is an aggressive interstitial lung disease with a high mortality rate. Putative drug targets in IPF have failed to translate into effective therapies at the clinical level. We identify TRAF2- and NCK-interacting kinase (TNIK) as an anti-fibrotic target using a predictive artificial intelligence (AI) approach.


PandaOmics is a cloud-based software platform that applies artificial intelligence and bioinformatics techniques to multimodal omics and biomedical text data for therapeutic target and biomarker discovery. PandaOmics generates novel and repurposed therapeutic target and biomarker hypotheses with the desired properties and is available through licensing or collaboration. Targets and biomarkers generated by the platform were previously validated in both in vitro and in vivo studies.


Drug discovery and development is a notoriously risky process with high failure rates at every stage, including disease modeling, target discovery, hit discovery, lead optimization, preclinical development, human safety, and efficacy studies. Accurate prediction of clinical trial outcomes may help significantly improve the efficiency of this process by prioritizing therapeutic programs that are more likely to succeed in clinical trials and ultimately benefit patients. Here, we describe inClinico, a transformer-based artificial intelligence software platform designed to predict the outcome of phase II clinical trials.


In recent years, drug discovery and life sciences have been revolutionized with machine learning and artificial intelligence (AI) methods. Quantum computing is touted to be the next most significant leap in technology; one of the main early practical applications for quantum computing solutions is predicted to be in quantum chemistry simulations. Here, we review the near-term applications of quantum computing and their advantages for generative chemistry and highlight the challenges that can be addressed with noisy intermediate-scale quantum (NISQ) devices.


The application of artificial intelligence (AI) has been considered a revolutionary change in drug discovery and development. In 2020, the AlphaFold computer program predicted protein structures for the whole human genome, which has been considered a remarkable breakthrough in both AI applications and structural biology. Despite the varying confidence levels, these predicted structures could still significantly contribute to structure-based drug design of novel targets, especially the ones with no or limited structural information.


Chemistry42 is a software platform for small molecule design and optimization that integrates Artificial Intelligence (AI) techniques with computational and medicinal chemistry methodologies. Chemistry42 efficiently generates novel molecular structures with optimized properties validated in both in vitro and in vivo studies and is available through licensing or collaboration. Chemistry42 is the core component of Insilico Medicine's drug discovery suite.


Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or for training semi-supervised predictive models in downstream tasks.


Gene expression profiles are useful for assessing the efficacy and side effects of drugs. In this paper, we propose a new generative model that infers drug molecules that could induce a desired change in gene expression. Our model, the Bidirectional Adversarial Autoencoder, explicitly separates cellular processes captured in gene expression changes into two feature sets: those related and those unrelated to the drug incubation.


We have developed a deep generative model, generative tensorial reinforcement learning (GENTRL), for de novo small-molecule design. GENTRL optimizes synthetic feasibility, novelty, and biological activity. We used GENTRL to discover potent inhibitors of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, in 21 days.


Modern computational approaches and machine learning techniques accelerate the invention of new drugs. Generative models can discover novel molecular structures within hours, while conventional drug discovery pipelines require months of work. In this article, we propose a new generative architecture, entangled conditional adversarial autoencoder, that generates molecular structures based on various properties, such as activity against a specific protein, solubility, or ease of synthesis.


Convolutional neural networks (CNNs) have been successfully used to handle three-dimensional data and are a natural match for data with spatial structure such as 3D molecular structures. However, a direct 3D representation of a molecule with atoms localized at voxels is too sparse, which leads to poor performance of the CNNs. In this work, we present a novel approach where atoms are extended to fill other nearby voxels with a transformation based on the wave transform.
