Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently.

Biochem J

Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, Faculty of Health and Life Sciences, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K.

Published: December 2020

The number of 'small' molecules that may be of interest to chemical biologists - chemical space - is enormous, but the fraction that has ever been made is tiny. Most strategies to date are discriminative, i.e. they address 'forward' problems (given a molecule, establish its properties). However, we normally wish to solve the much harder generative or inverse problem (given desired properties, find a molecule). 'Deep' (machine) learning based on large-scale neural networks underpins technologies such as computer vision, natural language processing, driverless cars, and world-leading performance in games such as Go; it can also be applied to the solution of inverse problems in chemical biology. In particular, recent developments in deep learning admit the in silico generation of candidate molecular structures and the prediction of their properties, thereby allowing one to navigate (bio)chemical space intelligently. These methods are revolutionary but require an understanding of both (bio)chemistry and computer science to be exploited to best advantage. We give a high-level (non-mathematical) background to the deep learning revolution, and set out the crucial issue for chemical biology and informatics as a two-way mapping from the discrete nature of individual molecules to the continuous but high-dimensional latent representation that may best reflect chemical space. A variety of architectures can do this; we focus on a particular type known as variational autoencoders. We then provide some examples of recent successes of these kinds of approach, and a look towards the future.
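As a rough illustration of the two-way mapping described in the abstract, the sketch below turns a discrete SMILES string into a real-valued vector and back, and shows the two ingredients that distinguish a variational autoencoder from a plain autoencoder: sampling a latent vector by reparameterization, and the KL penalty that pulls the latent distribution towards a standard normal. It is not taken from the paper. The alphabet, MAX_LEN and all function names are hypothetical, and no encoder or decoder network is trained here; a real model would learn those mappings from data.

```python
import math
import random

# Hypothetical toy alphabet (an assumption); real SMILES vocabularies are larger.
ALPHABET = ["C", "N", "O", "=", "(", ")", "1", " "]  # " " pads short strings
MAX_LEN = 8  # assumed fixed string length for this sketch


def one_hot_encode(smiles):
    """Map a discrete SMILES string to a flat vector of 0.0/1.0 values."""
    if len(smiles) > MAX_LEN or any(ch not in ALPHABET for ch in smiles):
        raise ValueError("string too long or contains an unknown symbol")
    vec = []
    for ch in smiles.ljust(MAX_LEN):
        vec.extend(1.0 if ch == sym else 0.0 for sym in ALPHABET)
    return vec


def decode_argmax(vec):
    """Map a real-valued vector back to a string: best-scoring symbol per position."""
    n = len(ALPHABET)
    chars = []
    for i in range(0, len(vec), n):
        block = vec[i:i + n]
        chars.append(ALPHABET[block.index(max(block))])
    return "".join(chars).rstrip()


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, I), the regulariser in a VAE loss."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    x = one_hot_encode("CC(=O)O")          # acetic acid as a discrete string
    print(len(x), decode_argmax(x))        # vector length and round-trip string
    rng = random.Random(0)
    z = reparameterize([0.0, 0.0], [0.0, 0.0], rng)  # a point in a 2-D latent space
    print(z, kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))
```

In a trained variational autoencoder, the encoder would produce mu and log_var from the one-hot vector, and the decoder would turn a sampled z back into per-position symbol scores that a function like decode_argmax converts to a string. Moving smoothly through z is then what allows nearby candidate molecules to be generated.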

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7733676
DOI: http://dx.doi.org/10.1042/BCJ20200781
