Large-scale prediction of activity cliffs using machine and deep learning methods of increasing complexity.

J Cheminform

Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, 53115, Bonn, Germany.

Published: January 2023

Activity cliffs (ACs) are formed by pairs of structural analogues that are active against the same target but have a large difference in potency. While much of our knowledge about ACs has originated from the analysis and comparison of compounds and activity data, several studies have reported AC predictions over the past decade. Unlike typical compound classification tasks, AC predictions must be carried out at the level of compound pairs representing ACs or nonACs. Most AC predictions reported so far have focused on individual methods or comparisons of two or three approaches and have investigated only a few compound activity classes (2 to 10). Although promising prediction accuracy has been reported in most cases, different system set-ups, AC definitions, methods, and calculation conditions were used, precluding direct comparisons of these studies. Therefore, we have carried out a large-scale AC prediction campaign across 100 activity classes comparing machine learning methods of greatly varying complexity, ranging from pair-based nearest neighbor classifiers and decision tree or kernel methods to deep neural networks. The results of our systematic predictions revealed the level of accuracy that can be expected for AC predictions across many different compound classes. In addition, prediction accuracy did not scale with methodological complexity but was significantly influenced by memorization of compounds shared by different ACs or nonACs. In many instances, limited training data were sufficient for building accurate models using different methods, and there was no detectable advantage of deep learning over simpler approaches for AC prediction. On a global scale, support vector machine models performed best, but only by small margins over other methods, including simple nearest neighbor classifiers.
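To make the pair-level framing concrete, the following is a minimal sketch of how compound pairs might be labeled as ACs or nonACs and then classified with a pair-based 1-nearest-neighbor rule. It is not the method used in the study. The 2 log-unit potency threshold, the toy feature sets standing in for fingerprints, and every function name are assumptions introduced here for illustration; the abstract does not specify the AC criterion or the molecular representation.

```python
# Illustrative sketch only: all names, data, and thresholds are hypothetical.

# Assumed AC criterion: potency gap of >= 2 log units (100-fold).
# The abstract does not state the criterion actually used.
AC_THRESHOLD = 2.0


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity of two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def label_pair(p1: float, p2: float, threshold: float = AC_THRESHOLD) -> int:
    """Return 1 (AC) if two log-scale potencies differ by >= threshold, else 0."""
    return int(abs(p1 - p2) >= threshold)


def pair_similarity(q, r) -> float:
    """Similarity of two compound pairs; the better of both alignments is used
    because the order of compounds within a pair is arbitrary."""
    (q1, q2), (r1, r2) = q, r
    direct = (tanimoto(q1, r1) + tanimoto(q2, r2)) / 2
    swapped = (tanimoto(q1, r2) + tanimoto(q2, r1)) / 2
    return max(direct, swapped)


def predict_1nn(query, training) -> int:
    """Assign the label of the most similar training pair (1-nearest neighbor)."""
    best_pair, best_label = max(
        training, key=lambda item: pair_similarity(query, item[0])
    )
    return best_label


def shares_compound(query, training) -> bool:
    """True if either query compound also occurs in a training pair, the
    situation in which a model can 'memorize' individual compounds."""
    seen = {cpd for pair, _ in training for cpd in pair}
    return any(cpd in seen for cpd in query)


# Toy compounds: (feature set, potency in log units). Values are made up.
A = (frozenset({1, 2, 3, 4}), 8.5)
B = (frozenset({1, 2, 3, 5}), 6.0)
C = (frozenset({1, 2, 3, 6}), 8.3)
D = (frozenset({7, 8, 9, 10}), 5.0)
E = (frozenset({7, 8, 9, 11}), 5.2)

# Training set of labeled pairs: (A, B) is an AC, (D, E) is a nonAC.
training = [
    ((A[0], B[0]), label_pair(A[1], B[1])),
    ((D[0], E[0]), label_pair(D[1], E[1])),
]

# Query pair (C, B) shares compound B with the training pair (A, B).
query = (C[0], B[0])
predicted = predict_1nn(query, training)
overlap = shares_compound(query, training)
```

In this toy setting the query pair reuses a compound from the training data, so a correct prediction says little about generalization. Splitting data so that test pairs share no compounds with training pairs is one way to probe the memorization effect the abstract describes.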

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825040
DOI: http://dx.doi.org/10.1186/s13321-022-00676-7
